Abstract

We present an overview of the Southern-sky MWA Rapid Two-metre (SMART) pulsar survey that exploits the Murchison Widefield Array's 
large field of view and voltage capture system to survey the sky south of 30° in declination for pulsars and fast transients in the 140-170 MHz
band. The survey is enabled by the advent of the Phase II MWA’s compact configuration, which offers an enormous efficiency in beam-forming 
and processing costs, thereby making an all-sky survey of this magnitude tractable with the MWA. Even with the long dwell times employed for 
the survey (4800 s), data collection can be completed in <100 hours of telescope time, while still retaining the ability to reach a limiting sensitivity 
of ~2-3 mJy (at 150 MHz, near zenith), which is effectively 3-5 times deeper than the previous-generation low-frequency southern-sky pulsar 
survey, completed in the 1990s. Each observation is processed to generate ~5000-8000 tied-array beams that tessellate the full ~610 deg² field
of view (at 155 MHz), which are then processed to search for pulsars. The voltage-capture recording of the survey also allows a multitude 
of post hoc processing options including the reprocessing of data for higher time resolution and even exploring image-based techniques for 
pulsar candidate identification. Due to the substantial computational cost in pulsar searches at low frequencies, the survey data processing is 
undertaken in multiple passes: in the first pass, a shallow survey is performed, where 10 minutes of each observation is processed, reaching 
about one-third of the full search sensitivity. Here we present the system overview including details of ongoing processing and initial results. 
Further details including first pulsar discoveries and a census of low-frequency detections are presented in a companion paper. Future plans 
include deeper searches to reach the full sensitivity and acceleration searches to target binary and millisecond pulsars. Our simulation analysis 
forecasts ~300 new pulsars upon the completion of full processing. The SMART survey will also generate a complete digital record of the 
low-frequency sky, which will serve as a valuable reference for future pulsar searches planned with the low-frequency Square Kilometre Array. 


Keywords: surveys: sky surveys - instrumentation: interferometers — methods: observational — pulsars: general — techniques: interferometric 


1. INTRODUCTION

Even after five decades of productive research, pulsars continue
to enable us to push the frontiers of physics and astrophysics.
These compact dense stars harbour physical conditions that
are non-existent elsewhere in the universe (e.g., ultra-strong
gravitational and magnetic fields and supra-nuclear matter
densities), which make them invaluable tools for studying extreme
physics. They are arguably amongst the most widely-exploited
astrophysical objects, with applications ranging from probing
the state of ultra-dense matter to testing strong-field gravity
(e.g., Demorest et al., 2010; Kramer et al., 2006; van Straten
et al., 2001), and from probing micro-arcsecond structure and
turbulence in the interstellar medium (ISM) to complex stellar
evolutionary scenarios (e.g., Bhat et al., 2004; Archibald
et al., 2009; Bailes et al., 2011). The phenomenal impact and
high-profile scientific applications (e.g., pulsar timing arrays
for the detection of nanohertz-frequency gravitational waves)
have elevated pulsar science to the ranks of a key science for the
Square Kilometre Array (SKA; e.g., Keane et al., 2015; Janssen
et al., 2015; Shao et al., 2015).


The backbone that enables this is the net result of a series of 
large pulsar surveys conducted over the past five decades (e.g., 
Manchester et al., 2001; Cordes et al., 2006; Keith et al., 2010; 
Stovall et al., 2014). Invariably, most of them involved tessellat- 
ing large parts of the sky of the instrument and recording data 
at high time and frequency resolutions (i.e., large data rates) 
and performing sensitive searches over the vast parameter space 
that is practically feasible. Many of them were prompted by 
the advent of new instrumentation or technology, and often 
exploited the computing affordable at the time. They have
also proven invariably rewarding in the longer term, and often
yielded a substantial increase in the pulsar population. For 
instance, the Molonglo pulsar survey in the 1970s found 150 
pulsars, practically doubling the known pulsar population at 
the time (Manchester et al., 1978), while the Parkes multibeam 
survey from the 1990s (Manchester et al., 2001) found 742 pul- 
sars, and discovered exotica such as the double pulsar system 
J0737-3039A/B and the eccentric neutron star-white dwarf 
binary J1141-6545, both of which have proven to be unique 
laboratories for testing general relativity and alternate theories 
of gravity (Kramer et al., 2006; Bhat et al., 2008; Venkatraman 
Krishnan et al., 2020; Kramer et al., 2021). This success led to 
next-generation multibeam surveys at Parkes and Arecibo, and 
more recently with the Five-hundred-meter Aperture Spher- 
ical radio Telescope (FAST). Already these have collectively 
discovered 600 pulsars to date. The landmark discovery of fast 
radio bursts (FRBs) in the Parkes high-time resolution radio 
universe (HTRU) survey (Thornton et al., 2013) even opened 
up an entirely new field of research. Large pulsar surveys 
have a proven track record of their ability to return signifi- 
cant scientific dividends in the long run, with the majority of 
the discoveries and spin-off science emerging from follow-up 
processing over the years. 

These multi-beam surveys have largely been at frequencies
≳1 GHz. The past decade also witnessed a number of
successful low-frequency pulsar surveys, most of which were 
prompted by the advent of new-generation low-frequency 
facilities (e.g., Low Frequency Array; LOFAR), or new re- 
ceivers or pulsar instrumentation at the more traditional facili- 
ties such as the Green Bank Telescope (GBT) and the Giant 
Metre-wave Radio Telescope (GMRT). Notable among these 
are the drift-scan surveys with the Arecibo Telescope and 
GBT, and the ongoing surveys at the GMRT and GBT. The 
drift-scan surveys, in the 300-350 MHz range, despite their 
non-traditional nature, have led to > 100 pulsar discoveries, 
while the highly successful Green Bank Northern Celestial 
Cap (GBNCC) survey has, to date, found 160 pulsars. The net 
tally from the low-frequency surveys of the past decade alone is 
>400 pulsars, including 73 pulsars by the LOFAR Tied-Array 
All-Sky (LOTAAS) survey (Sanidas et al., 2019). Additionally, 
targeted searches have been undertaken toward unidentified 
Fermi gamma-ray sources (mostly at low frequencies), leading 
to >80 pulsars (Deneva et al., 2021, and references therein). 
The LOTAAS survey, the processing of which is still ongoing, 
also discovered the longest-period (23.5 s) pulsar known until 
recently (Tan et al., 2018b), when a 76-s pulsar was discovered 
with MeerKAT (Caleb et al., 2022). In essence, surveys at low 
frequencies have proven to be highly effective, particularly in 
uncovering the local population of pulsars, and mapping out 
the high-Galactic latitude (b) parts of the sky. 

Surveys at low frequencies offer several benefits but they 
also have their limitations. An appealing factor is the generally 
steep spectral nature of most radio pulsars, where the flux 
density at frequency $\nu$ is $S_\nu \propto \nu^{\alpha}$, where $\alpha$ is the spectral
index. The spectral index is known to vary over a wide range
for pulsars, $-4 < \alpha < 0$, but the average spectral index
$\langle\alpha\rangle = -1.6 \pm 0.03$ for long-period pulsars (Jankowski et al., 2018),
and is somewhat steeper ($\langle\alpha\rangle = -1.9 \pm 0.1$) for MSPs (Toscano
et al., 1998; Dai et al., 2015), with a 1-$\sigma$ dispersion of ~1.
While this suggests most pulsars are significantly brighter at
low frequencies, this is more than offset by the even steeper
dependence of the sky background noise ($T_{\rm sky} \propto \nu^{-2.5}$).
The sky background is also highly direction-dependent and is 
typically significantly reduced toward higher Galactic latitudes. 
The main benefit is the inherently larger fields-of-view of the 
low-frequency telescopes, which substantially increase the 
efficiency in telescope time required and hence the time for 
completion of large surveys. 
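
As a rough illustration of the trade-off between steep pulsar spectra and the sky background, the sketch below compares the two power-law scalings between 1400 and 150 MHz. It is illustrative only: the function names are hypothetical, the sky-temperature index of -2.5 is an assumed representative value, and the comparison considers the sky term alone, ignoring receiver noise (which matters at higher frequencies), telescope gain and bandwidth.

```python
SKY_INDEX = -2.5  # assumed sky-temperature spectral index (illustrative value)


def power_law_ratio(nu_mhz: float, nu_ref_mhz: float, index: float) -> float:
    """Ratio of a power-law quantity at nu relative to nu_ref: (nu / nu_ref) ** index."""
    return (nu_mhz / nu_ref_mhz) ** index


def net_sky_limited_gain(nu_mhz: float, nu_ref_mhz: float, alpha: float,
                         sky_index: float = SKY_INDEX) -> float:
    """Flux-density gain divided by sky-noise growth when moving from nu_ref to nu."""
    flux_gain = power_law_ratio(nu_mhz, nu_ref_mhz, alpha)
    noise_growth = power_law_ratio(nu_mhz, nu_ref_mhz, sky_index)
    return flux_gain / noise_growth


if __name__ == "__main__":
    for alpha in (-1.6, -1.9):  # mean spectral indices quoted in the text
        print(f"alpha = {alpha}: "
              f"flux gain x{power_law_ratio(150.0, 1400.0, alpha):.1f}, "
              f"net gain x{net_sky_limited_gain(150.0, 1400.0, alpha):.2f}")
```

Under these assumptions the net factor is below unity for the mean spectral indices, and exceeds unity only for pulsars whose spectra are steeper than the assumed sky index.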


Amongst the multitude of other considerations are interstellar
medium (ISM) propagation effects, which tend to strongly
influence low-frequency pulsar searches; the most familiar
(and significant) one is the dispersion that manifests
as frequency-dependent time delays in arrival times,
$\Delta t \propto \mathrm{DM}\,\nu^{-2}$, where the dispersion measure (DM) is the
line-of-sight integral of the electron density $n_e$. This non-linear,
inverse dependence on frequency implies very large delays
at low frequencies (≲200 MHz); e.g., a pulsar with a DM
= 100 pc cm⁻³ will have its signal spread over ~7.5 s in observations
made over a 30 MHz band centred at 150 MHz,
as opposed to < 0.1 s across a similar (i.e., 20%) fractional
bandwidth around 1.4 GHz. Circumventing this necessitates
much finer frequency resolution (Av) so the residual dispersive 
smearing across the finite channel width can be minimised, 
and consequently requires many more channels across the 
recording bandwidth, and hence a much larger data rate and 
substantial processing needs. 
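
The dispersive delays quoted in this paragraph can be checked with a few lines of arithmetic. The sketch below uses the standard cold-plasma dispersion constant (about 4.149 ms for frequencies in GHz and DM in pc cm⁻³); the function name and the choice of band edges are ours, for illustration only.

```python
K_DM_MS = 4.148808  # dispersion constant in ms GHz^2 pc^-1 cm^3 (standard value)


def dispersion_delay_s(dm: float, f_lo_mhz: float, f_hi_mhz: float) -> float:
    """Delay (s) of the signal at f_lo relative to f_hi for a given DM (pc cm^-3)."""
    f_lo_ghz = f_lo_mhz / 1e3
    f_hi_ghz = f_hi_mhz / 1e3
    return K_DM_MS * 1e-3 * dm * (f_lo_ghz ** -2 - f_hi_ghz ** -2)


if __name__ == "__main__":
    dm = 100.0
    # 30 MHz band centred at 150 MHz
    print(f"low band : {dispersion_delay_s(dm, 135.0, 165.0):.2f} s")
    # 20% fractional band centred at 1.4 GHz
    print(f"high band: {dispersion_delay_s(dm, 1260.0, 1540.0):.3f} s")
    # residual smearing across one 10-kHz channel at 150 MHz
    print(f"10-kHz channel: {dispersion_delay_s(dm, 149.995, 150.005) * 1e3:.2f} ms")
```

The last line shows why fine channels are needed: even a 10-kHz channel at 150 MHz carries a residual smearing of a few milliseconds at this DM.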


The other significant effect is pulse broadening resulting
from multipath propagation as a consequence of scattering
in the ISM, the characteristic time for which is a nonlinear
function of both DM and frequency, i.e., $\tau_d \propto \mathrm{DM}^{2.2}\,\nu^{-4.4}$,
under the assumption of a pure Kolmogorov form of electron
density spectrum (Cordes et al., 1985). This poses a significant
limitation in low-frequency pulsar searches, especially when
the pulse broadening time exceeds the pulsar's spin period, i.e.,
$\tau_d \gtrsim P$, as it results in a significant degradation or even a loss of
sensitivity to periodic emission. As with the sky background,
scatter broadening is also highly line-of-sight dependent; it
is much larger in the plane, or toward the Galactic Centre,
compared to high-|b| sight lines. Empirical relations exist to
guide expected broadening times as a function of DM and
frequency (e.g., Bhat et al., 2004; Geyer et al., 2017), and can be
used to guide the observing/search strategies, e.g., $\tau_d \gtrsim 100$ ms
at DM ≳ 300 pc cm⁻³, for a line of sight as far off as |b| ~ 5°
and ~ 30° away from the Galactic Centre (GC) in longitude.
This implies, at low frequencies, the search volume is largely
limited to a few kpc in the plane. However, this is not a
serious limitation at higher Galactic latitudes, where the DM
tends to saturate at ~20-50 pc cm⁻³ for |b| > 15°. In other
words, the higher survey speeds of low-frequency surveys can
be optimally exploited for covering high-|b| parts of the sky,
without compromising detection sensitivity.
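
To give a feel for how quickly scattering grows toward low frequencies, the following sketch applies the Kolmogorov scaling (frequency index -4.4, DM index 2.2) to a reference measurement. The function names and the example pulsar (0.01 ms of scattering at 1400 MHz) are hypothetical; real sight lines scatter widely about any such scaling.

```python
def scale_scattering_ms(tau_ref_ms: float, nu_ref_mhz: float, nu_mhz: float,
                        dm_ref: float = 1.0, dm: float = 1.0,
                        nu_index: float = -4.4, dm_index: float = 2.2) -> float:
    """Scale a reference scattering time to another frequency and/or DM."""
    return tau_ref_ms * (nu_mhz / nu_ref_mhz) ** nu_index * (dm / dm_ref) ** dm_index


def periodicity_washed_out(tau_ms: float, period_ms: float) -> bool:
    """True when the broadening time is comparable to or exceeds the spin period."""
    return tau_ms >= period_ms


if __name__ == "__main__":
    # Hypothetical pulsar: 0.01 ms of scattering measured at 1400 MHz
    tau_150 = scale_scattering_ms(0.01, 1400.0, 150.0)
    print(f"tau at 150 MHz: {tau_150:.0f} ms")
    for period_ms in (5.0, 100.0, 1000.0):
        print(f"P = {period_ms} ms, washed out: {periodicity_washed_out(tau_150, period_ms)}")
```

In this example a broadening time that is negligible at 1.4 GHz becomes comparable to the periods of many pulsars at 150 MHz, which is the sense in which scattering limits the low-frequency search volume in the plane.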


Yet another relevant ISM effect, especially at low frequencies,
is the modulation of apparent pulsar intensities due to
scintillation effects. As with the pulse broadening, the observable
effects strongly depend on frequency and the line of
sight, as it is essentially another manifestation of multipath
propagation. For relatively nearby pulsars (DM <50 pc cm⁻³)
this often manifests as rapid (and very large) modulations in 
both time and frequency with characteristic scales in the range 
~0.1-5 MHz and ~1-100 min at ~150 MHz; this is diffractive 
scintillation (e.g., Rickett, 1990). Refractive scintillation also 
leads to intensity modulations, but on much longer timescales
of days to weeks (at low frequencies), and the observed variations
in mean flux densities can be as large as a factor of ~5-6
for low to moderate DM pulsars (e.g., Bell et al., 2016; Bhat
et al., 2018). From the perspective of candidate detection in 
low-frequency searches, this sometimes results in fortuitous 
brightening (or inauspicious dimming) of pulsars, which pro- 
vides the opportunity to detect pulsars that were missed earlier 
(e.g., owing to scintillation dimming), or to detect a pulsar that 
might be below the sensitivity limit of a survey. This further 
strengthens the case for low-frequency surveys. 

Despite these challenges, pulsars were originally discov- 
ered at low frequencies (at 81.5 MHz; Hewish et al. 1968) and 
much of the early years of pulsar astronomy were focused at 
low frequencies. The eventual quest to find them in large 
numbers and timing them at high precision pushed much of 
pulsar astronomy (searches and timing in particular) to frequencies
≳1 GHz. However, the advent of several low-frequency
telescopes over the past decade and advances in affordable
high-performance computing are effectively leading to a resurgence
of low frequency astronomy including large sky surveys, many
of which are conducted at frequencies ≲500 MHz.

The success of these northern surveys strongly motivates 
an all-sky pulsar survey with the Murchison Widefield Array 
(MWA) that operates in the 80-300 MHz range in the South- 
ern Hemisphere. The MWA, which was originally built as an 
array of 128 tiles (where each tile is a 4x4 dipole array) with 
a maximum baseline of 3 km, is also Australia’s precursor for 
the low-frequency SKA (i.e., SKA-Low; Tingay et al. 2013). 
Even though the MWA was not initially designed for pul- 
sar science, the eventual addition of a voltage capture system 
(VCS; Tremblay et al. 2015) and the development of software- 
defined instrumentation (for offline processing) equipped it 
as a pulsar-capable facility. Notwithstanding the limitations
of large data rates (28 TB hr⁻¹) and the associated data
management/processing challenges, the VCS has been exploited
for wide-ranging science from studies of millisecond pulsars 
to sporadic emission from pulsars (e.g. Bhat et al., 2016; Mey- 
ers et al., 2018; Kaur et al., 2019), and from investigating the 
pulsar emission physics to studying propagation effects caused 
by the interstellar medium (e.g. McSweeney et al., 2017; Kaur 
et al., 2022). The progress in this area, along with the array’s 
upgrade to Phase II (Wayth et al., 2018), whereby a compact 
configuration of 128 tiles within 300 m was possible on a semi- 
regular basis, has made all-sky pulsar searches tractable with 
this telescope. 

The SMART survey described in this paper has two main 
objectives: (1) performing sensitive searches for pulsars and 
fast transients in the sky south of +30° in declination at
140-170 MHz; and (2) mapping the sky for low-frequency detection
of already known pulsars in the southern sky. The main
novelty of the survey is the use of a voltage-capture mode
for data recording (as opposed to the filterbank data format
that has been adopted for all past and ongoing surveys), and
hence an astonishingly high survey speed for data collection
(i.e., ~500 deg² hr⁻¹ at 100-µs/10-kHz resolutions). However,
the computational costs of processing (i.e., beamforming and
searching) are substantial at low frequencies, and thus drive
the feasible strategies for data processing, especially at early
stages.

With the large survey speed substantially reducing the de- 
mand for telescope time for survey completion, longer dwell 
times become affordable, which also increases the sensitivity to 
the detection of sporadic or intermittent classes of objects such
as rotating radio transients (RRATs; McLaughlin et al. 2006),
intermittent or state-switching pulsars, extreme nullers etc.
(e.g., Kerr et al., 2014) among the classes of radio-emitting
neutron stars, and even the enigmatic fast radio bursts (FRBs;
e.g., Thornton et al. 2013). The detectability of all of these
transient classes of objects is dictated by the "on-sky" time metric
$X = \Omega T$, where $\Omega$ is the instantaneous FoV and $T$ is the
time spent on sky (dwell time in the case of an all-sky survey).
Following the discussion in Sanidas et al. (2019) in the
context of LOTAAS, $X_{\rm SMART}$ = 52,735 deg² hr, which is a
factor of two more than that of LOTAAS for which $X_{\rm LOTAAS}$
= 23,400 deg² hr (at 135 MHz), and indeed much larger than
$X_{\rm GBNCC}$ = 1430 deg² hr, $X_{\rm GHRSS}$ = 835 deg² hr and $X_{\rm AO327}$
= 132 deg² hr (all at 300-350 MHz).
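
The comparison of on-sky metrics can be tabulated directly from the values quoted in this paragraph. The snippet below is a minimal sketch: the dictionary simply restates those numbers (with survey labels as we read them from the text), and the helper names are hypothetical.

```python
# On-sky time metric X (deg^2 hr) for each survey, as quoted in the text
ON_SKY_DEG2_HR = {
    "SMART": 52735.0,
    "LOTAAS": 23400.0,
    "GBNCC": 1430.0,
    "GHRSS": 835.0,
    "AO327": 132.0,
}


def on_sky_metric(fov_deg2: float, time_hr: float) -> float:
    """X = Omega * T for an instantaneous FoV (deg^2) and time on sky (hr)."""
    return fov_deg2 * time_hr


def ratio_to(reference: str, table: dict = ON_SKY_DEG2_HR) -> dict:
    """Ratio of the reference survey's metric to that of every other survey."""
    return {name: table[reference] / value
            for name, value in table.items() if name != reference}


if __name__ == "__main__":
    for name, ratio in ratio_to("SMART").items():
        print(f"X_SMART / X_{name} = {ratio:.1f}")
```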

Here we present an overview of the Southern-sky MWA 
Rapid Two-metre (SMART) pulsar survey. In § 2 we outline 
the main science goals, and describe the observing strategy 
adopted for sky tessellation. Procedures for data processing and 
analysis are described in § 3, and the strategies for confirmation 
and initial follow-up in § 4. In § 5 we describe the survey 
simulations and the expected yield. Future processing plans 
are outlined in § 6, followed by a summary in § 7. 


2. Survey Description 

2.1 Science goals and Motivation 

The broader goals of the SMART survey are similar to most 
other large sky surveys, i.e., exploring the new parameter space 
that is opened by a leap in instrumentation, technology, or 
sensitivity and to uncover a large population of previously 
undetected pulsars. The fact that the currently known pulsar
population (~3300, cf. the ATNF pulsar catalogueᵃ v1.67;
Manchester et al. 2005) represents only a small fraction (≲10%)
of the total expected (i.e. beamed in our direction) Galactic
population (e.g., Keane et al., 2015, and references therein)
strongly motivates such large sky surveys. Indeed, conducting
a full Galactic census of pulsars is a high-priority science
objective for the SKA. Further, given the number of broader
questions surrounding the neutron-star population (e.g. birth
rates, and comparison with rates of supernovae), the detectable
pulsar population is largely guided by the known population of
pulsars at any given time. It is therefore imperative to explore
every possible avenue and steadily refine our knowledge of the
pulsar population. Furthermore, the detection prospects of pulsars
in a given frequency band strongly depend on the emission
and propagation properties at those frequencies; however, the
current forecast of a detectable population in the SKA-Low
band is largely guided by the pulsar population uncovered by
high-frequency surveys.

ᵃ https://www.atnf.csiro.au/research/pulsar/psrcat/

Obtaining a large body of measurements such as DM, 
scattering and Faraday rotation, by using pulsars as probes 
of the ISM, will also enable mapping out the distribution of 
magneto-ionic (and turbulent) plasma in the Galaxy, which 
is steadily refined with a larger sample of measurements (e.g. 
Cordes & Lazio, 2002; Bhat et al., 2004; Deller et al., 2016; 
Yao et al., 2017). 

Finally, an underlying goal of any large-sky pulsar survey 
is to discover exotic objects; while it is hard to design any 
particular survey specifically for this, historical examples are 
abundant, e.g., the discovery of the double pulsar in the Parkes 
multibeam (PMB) survey (Lyne et al., 2004), the 23.5-second 
period pulsar in LOTAAS (Tan et al., 2018b), and the transi- 
tional millisecond pulsar (MSP) in the Arecibo drift-scan sur- 
vey (Archibald et al., 2009). All such broader and high-profile 
objectives are certainly applicable for the SMART survey. 

The SMART pulsar survey also perfectly complements on- 
going northern-sky surveys in sky and frequency coverage 
(Table 1). Surveys at low frequencies will likely be sensitive to 
a different pulsar population, and therefore an all-sky survey 
at low frequencies is also essential to develop a comprehen- 
sive picture of neutron-star/pulsar populations in the Galaxy. 
Bearing this in mind (and as we detail in § 2.4), the survey 
is designed to reach a final sensitivity comparable to that of 
LOTAAS, i.e., the use of long dwell times (4800 s) to attain a
limiting sensitivity (10σ) of ~2-3 mJy for long-period pulsars
with small duty cycles, and assuming a spectral index α = -1.5
and no turnover down to ~150 MHz. This is ~3-5 times
deeper than the previous-generation low-frequency (70 cm) 
survey (Manchester et al., 1996) in the south (and thence an 
accessible search volume ~5-10 times larger), and ~2-3 times 
deeper than the high-latitude segment of the Parkes HTRU 
survey (Keith et al., 2010). 
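
The link between the quoted sensitivity improvement and the accessible search volume can be sketched with a simple scaling argument. The assumptions here are ours, not stated in the text: an inverse-square law for flux density (so the limiting distance scales as the inverse square root of the limiting flux density) and sources spread uniformly in a Euclidean volume, with propagation effects ignored.

```python
def volume_gain(sensitivity_gain: float) -> float:
    """Search-volume gain for a survey that is `sensitivity_gain` times deeper.

    Assumes d_max scales as S_min**-0.5 and V scales as d_max**3,
    so the volume gain is sensitivity_gain**1.5.
    """
    return sensitivity_gain ** 1.5


if __name__ == "__main__":
    for gain in (3.0, 5.0):
        print(f"{gain:.0f}x deeper -> ~{volume_gain(gain):.1f}x larger volume")
```

For a survey 3 to 5 times deeper, this gives volume gains of roughly 5 to 11, in line with the ~5-10 quoted in the text. A disc-like Galactic distribution would give a shallower scaling.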

The SMART survey will also serve as a reference survey for 
future deeper surveys at low frequencies, such as those planned 
with SKA-Low (Keane et al., 2015). While the success of (and 
the lessons learned from) all ongoing low-frequency surveys 
will indeed inform SKA-Low pulsar surveys, the SMART sur- 
vey will potentially play an additional important role, since 
the MWA is also the official low-frequency precursor for SKA- 
Low, and is located at the same site where SKA-Low will be 
built. Specifically, the sky coverage of the SMART survey is 
identical to that of SKA-Low, which means a higher degree 
of synergistic overlap in calibration and beamforming method- 
ologies, than most northern facilities. The role of reference sur- 
veys is vividly demonstrated by the later generation multibeam 
surveys in the south; e.g., the PMB survey for its successors, 
the HTRU pulsar survey (Keith et al., 2010) and the SUrvey
for Pulsars and Extragalactic Radio Bursts (SUPERB) (Keane
et al., 2018), which can now play a similar role for the planned
surveys with MeerKAT. However, aside from the Parkes 70 cm
survey of the 1990s, the low-frequency southern sky remains
essentially unexplored for pulsar searches, especially at ≲300
MHz.

Aside from the aforementioned primary science goals, there 
are also some auxiliary goals for the SMART survey, largely en- 
abled by the novelty of the data recording strategy, i.e., the use 
of voltage capture system and post-processing, as opposed to 
the beamformed data in the filterbank format. These not only 
facilitate a number of additional strategies for confirmation 
and follow-up, but they can also be potentially exploited for 
developing and trialling alternate strategies for pulsar searches; 
e.g., image-based techniques for the identification of promis- 
ing candidates that take advantage of pulsar properties such as 
steep-spectrum, variability or circular polarisation (e.g., Sett
et al., 2022). These, in principle, also offer some advantages 
over traditional search methods, especially for extreme pul- 
sars like those with sub-millisecond periods, or distant pulsars 
whose pulse shapes will be significantly broadened due to multi- 
path scattering, but will be sensitive primarily to very bright 
sources. 

Notwithstanding the anticipated scientific merits of the
SMART survey, computational requirements are substantial, 
especially given the large data rate of the VCS and searching at 
low frequencies, thereby necessitating a multi-pass processing 
strategy. In the first-pass processing, we perform a shallow 
survey, where 10 minutes of data from each observation are 
processed, and the search is limited to basic periodicity, and 
DMs up to 250 pc cm⁻³. In this paper, we outline the observing
strategies employed for the survey, and processing strategies 
adopted for the initial phase, and present analysis and results 
to date, as well as plans and strategies for future processing. A 
companion paper (hereafter Paper II) will describe the survey 
status, pulsar census to date and more details on follow-up 
strategies including timing and imaging follow-ups. 


2.2 Survey strategy 


The novel strategy employed for the SMART survey, i.e., the 
use of VCS recording from 128 tiles, which allows high-time 
resolution (and instantaneous) sampling of a very large patch 
of the sky (but at the expense of a large data rate of 28 TB hr⁻¹),
necessitates substantial processing to enable large-scale pulsar 
searching applications. Most importantly, the voltage data 
from the tiles need to be coherently combined to generate 
thousands of tied-array beams prior to any search process- 
ing. The undertaking of the SMART survey is particularly 
enabled by the Phase II upgrade, whereby a compact con- 
figuration of 128 tiles within ~300 m became available on a 
semi-regular basis. The compact configuration of Phase II 
brings an enormous efficiency in terms of beamforming cost; 
specifically, the number of tied-array (i.e. phased array) beams 
required to fill the full FoV (at a gain level down to half power
point) is reduced from 2.7 × 10⁶ for the Phase I array to
3.9 × 10³ for the Phase II compact array. This reduction of
more than two orders of magnitude in the computational cost
makes an all-sky high-sensitivity pulsar search tractable (and
affordable) with an interferometric array like the MWA. Thus,
with the beamforming step integrated into software-defined
instrumentation, this effectively translates into an impressively
large survey speed of ~450 deg² hr⁻¹, i.e., the full visible sky
of the MWA (δ < +30°) can be surveyed in a modest number
of VCS pointings.

Figure 1. Sky tessellation of the SMART survey. The left panels show beam tiling patterns for two select pointings: top one a near-zenith pointing (δ = -28°),
the bottom one a far southern pointing (δ = -70°). The number of tied-array beams varies from ~6000 to ~8000 from near-zenith to far-zenith pointings, and
the beam shape becomes elliptical at large offsets from the zenith. The size of the circle/ellipse indicates half power tied-array beam size; the red and blue
circles correspond to the low and high ends of the SMART band (140-170 MHz). The right panels show the primary beam response for the same declination
pointings, at the central frequency of 155 MHz.

Table 1. Parameters of large pulsar surveys over the past decade

Survey   Telescope   Frequency band   Sky coverage       Time resolution   Frequency resolution   Dwell time       S_min†    Reference
                     (MHz)                               (µs)              (kHz)                  (s)              (mJy)
LOTAAS   LOFAR       119-151          δ > 0°             491.52            12.21                  3600             1-2       SCB+19
SMART    MWA         140-170          δ < +30°           100               10                     4800             2-3       This work
GBNCC    GBT         300-400          δ > -40°           81.92             24                     120              1.1       SLR+14
GHRSS    GMRT        306-338          -40° > δ > -54°    30.72-61.44       15.625-31.25           900, 1200        1.0       BCM+16
HTRU     Parkes      1182-1522        δ < +30°                             390                    240, 540, 4200   0.2-0.6   KJvS+10
GPPS     FAST        1100-1500        -10° < b < +10°    49.152            244.14                 300              0.005     HWW+21

† Minimum detectable flux density for a 10-σ detection, for long-period pulsars (P ≥ 0.1 s), with small duty cycle (W/P ~ 0.05), and at DMs ≲50 pc cm⁻³.
Notes: Survey description reference - SCB+19: Sanidas et al. (2019) for LOTAAS; SLR+14: Stovall et al. (2014) for GBNCC; BCM+16: Bhattacharyya et al. (2016)
for GHRSS; KJvS+10: Keith et al. (2010) for HTRU; HWW+21: Han et al. (2021) for GPPS

The first-pass survey strategy of processing only 10 minutes 
of data from each observation (hence reaching about one- 
third of the full-search sensitivity) was adopted also to boost 
the prospects of early pulsar discoveries. Even though the 
combination of the VCS mode and the FoV provides a large 
survey speed, practical considerations such as the availability 
of the compact configuration necessitated multiple observing 
campaigns to advance the survey. Further details including 
the survey status and completion plans are described in Paper 
II. 
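
The statement that 10 minutes of data reach about one-third of the full search sensitivity follows from the radiometer equation, in which the limiting flux density scales as the inverse square root of the integration time. A minimal sketch, assuming all other terms (bandwidth, system temperature, gain) are unchanged; the function name is hypothetical:

```python
import math


def relative_sensitivity(t_processed_s: float, t_full_s: float) -> float:
    """Fraction of the full-integration sensitivity reached after t_processed_s.

    Uses S_min proportional to 1/sqrt(t), so the fraction is sqrt(t_processed / t_full).
    """
    return math.sqrt(t_processed_s / t_full_s)


if __name__ == "__main__":
    fraction = relative_sensitivity(10 * 60, 4800)  # 10 min out of a 4800-s dwell
    print(f"first-pass sensitivity: {fraction:.2f} of full")
```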


2.3 Beamforming and Sky tessellation 


The signal processing chain of the MWA including the high 
time resolution system is described in a number of earlier 
papers (e.g., Tingay et al., 2013; Prabu et al., 2015; Tremblay 
et al., 2015), and is briefly reiterated here. In the legacy system 
that was employed for survey campaigns to date, the VCS 
sub-system follows the second stage of channelisation in the 
signal path. Each element of the array is a 4 x 4 dipole array, 
called a “tile”, the signals from which are fed to an analogue 
beamformer that defines the FoV. The beamformed signals are 
Nyquist-sampled at 655.36 Msps and channelised (after signal
conditioning) using a polyphase filterbank (PFB) to generate 
256 x 1.28-MHz signal outputs (i.e., coarse channelisation), 
24 of which are transported to the central processing facility, 
where a second-stage PFB operation is performed, resulting 
in 128 x 10-kHz time series for each coarse channel, i.e., 3072 
channels across the recording 30.72 MHz bandwidth. These 
voltage time series are written to an array of RAID disks by 
the VCS as 4+4-bit complex voltage samples. These data are 
recorded (up to a maximum duration of 100 minutes) and 
transported to the Pawsey Supercomputing Centre where 
further processing (including calibration and beamforming) is 
carried out. 
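
The channelisation figures in this paragraph, together with the 28 TB hr⁻¹ data rate quoted earlier, can be cross-checked with simple bookkeeping. The sketch below assumes two polarisations per tile and critically sampled fine channels (one complex sample per 100 µs per channel); neither assumption is spelled out in this section, and the constant names are ours.

```python
N_TILES = 128            # tiles recorded by the VCS (from the text)
N_POL = 2                # assumed: two polarisations per tile
N_COARSE = 24            # coarse (1.28-MHz) channels transported
FINE_PER_COARSE = 128    # fine channels per coarse channel
FINE_WIDTH_HZ = 10_000   # fine channel width (10 kHz)
BYTES_PER_SAMPLE = 1     # one 4+4-bit complex voltage sample


def n_fine_channels() -> int:
    """Total number of fine channels across the recorded band."""
    return N_COARSE * FINE_PER_COARSE


def bandwidth_mhz() -> float:
    """Total recorded bandwidth in MHz."""
    return n_fine_channels() * FINE_WIDTH_HZ / 1e6


def data_rate_tb_per_hr() -> float:
    """Raw voltage data rate in TB/hr (1 TB = 1e12 bytes)."""
    samples_per_s = FINE_WIDTH_HZ  # assumed critical sampling of each fine channel
    bytes_per_s = (N_TILES * N_POL * n_fine_channels()
                   * samples_per_s * BYTES_PER_SAMPLE)
    return bytes_per_s * 3600 / 1e12


if __name__ == "__main__":
    print(n_fine_channels(), "channels over", bandwidth_mhz(), "MHz")
    print(f"data rate: {data_rate_tb_per_hr():.1f} TB/hr")
```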


VCS-recorded data can be processed offline for calibration 
and tied-array beamforming (Ord et al., 2019) and, optionally, 
can also be reprocessed to reconstruct higher time resolution
voltage data at the native coarse channel resolution of 0.78 µs
(McSweeney et al., 2020). To realise the SMART pulsar survey, 
this beamformer functionality was further enhanced to option- 
ally generate several dozens of tied-array beamformed outputs 
simultaneously — i.e. the so-called multi-pixel beamformer, 
which is essentially the front-end of the pulsar search process- 
ing chain. The implementation details and benchmarks are 
described in Swainston et al. (2022). This software tied-array 
beamformer has been benchmarked on Pawsey's Garrawarla 
and Swinburne's OzSTAR supercomputers. It performs 3x 
faster on the latter, which has been the primary high perfor- 
mance computing (HPC) platform for much of our SMART 
data processing. 

Thanks to the large field-of-view of the MWA (~610 deg² at
155 MHz, near zenith), the entire sky south of declination
+30° (δ < +30°) can be covered in a modest number of telescope
pointings. The sky tiling strategy is shown in Fig. 1. In 
short, we adopted pointings similar to that of the GaLactic and 
Extragalactic All-sky MWA survey (Wayth et al., 2015), i.e., 
meridian drift scans optimised for maximum sensitivity at each 
declination as well as for more reliable calibration (referred to 
as ‘sweet spots’). In this case, the number of pointings depends 
on the degree of overlap in right ascension (RA), with a minimum
of 58 pointings for minimal (1°) overlap and 78 for a 15°
overlap. A large overlap is more optimal as it effectively serves
as a two-pass strategy, which is desirable at low frequencies
where intermittency (from effects such as scintillation) tends
to be more pronounced. After exploring the full range of options,
and also factoring in the available resource constraints,
we converged on a 10° overlap as an acceptable choice. As
shown in Fig. 1, this amounts to a total of 70 pointings, i.e., 93 hr
of telescope time for the full SMART survey.
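
The telescope-time figure, and the survey speed of order 450 deg² hr⁻¹, follow from the field of view, dwell time and number of pointings. The sketch below is a back-of-the-envelope check with hypothetical helper names; it treats the survey speed as FoV divided by dwell time and ignores the overlap between pointings.

```python
def survey_speed_deg2_per_hr(fov_deg2: float, dwell_s: float) -> float:
    """Sky area covered per hour of observing, ignoring pointing overlap."""
    return fov_deg2 / (dwell_s / 3600.0)


def total_time_hr(n_pointings: int, dwell_s: float) -> float:
    """Total on-sky time for a set of equal-length pointings."""
    return n_pointings * dwell_s / 3600.0


if __name__ == "__main__":
    print(f"survey speed: {survey_speed_deg2_per_hr(610.0, 4800.0):.0f} deg^2/hr")
    print(f"telescope time: {total_time_hr(70, 4800.0):.1f} hr")
```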

For each pointing, many tied-array beams (TABs) are
formed to maximise the sensitivity across the FoV. The tied-array
beams are pointed towards fixed right ascension and
declination, with the necessary adjustments to the tile phases
made for every second of data (Ord et al., 2019). Thus, although
the observations themselves are drift scans, sources can
be tracked by the same TAB for up to the full duration of the
observation.

ᵇ The operational constraints of the MWA limited VCS mode observations
to a maximum of 25 hours per observing semester, with the legacy system.

The precise size and shape of the TABs are a non-trivial
function of the tile layout of the compact configuration, equivalent
to the “Compact robust 1” synthesised beam whose cross
section is presented in Figure 7 of Wayth et al. (2018) and
discussed in Swainston et al. (2022) and in § 3. Due to
the compact configuration’s redundant baselines (in the two 
“hexes”), the most sensitive parts of the TAB consist of a main 
lobe whose full width half maximum (FWHM) at 155 MHz is 
23’, surrounded by a pattern of discrete grating lobes of similar 
width. Although these grating lobes can be exploited for can- 
didate confirmation (further discussed below), we choose the 
TAB pointings to form a dense (hexagonal) grid such that the 
main lobes overlap by ~20%, as shown in Fig. 1. This effec- 
tively Nyquist-samples the sky at a gain of the half-power level 
or more. The beam shape used for this calculation assumes that 
all 128 tiles are functioning, whereas, in reality, up to ~10% 
of tiles may be flagged in any given observation. Unless the 
flagged tiles preferentially result in a reduction of the longest 
baselines, the effect on the beam shape is negligible. 

Tiling the FoV in this way translates to ~6300 TABs for
an observation pointed toward the zenith. For pointings away
from the zenith, where the beam shape develops a significant
ellipticity (e.g., at zenith angle 15°, ellipticity
$\epsilon = \theta_{\rm maj}/\theta_{\rm min} = 1.36$, where
$\theta_{\rm maj}$ and $\theta_{\rm min}$ are the major and minor axes of
the TAB), the number of TAB pointings is in the range
~4200-4500. Further, the beam size also varies across the 20%
fractional bandwidth of our survey observations; for example,
for a pointing toward the zenith (where the TAB is nearly
symmetrical), the FWHM is 25.3′ at 140 MHz but reduces to
20.7′ at 171 MHz. This further justifies our rationale for a 20%
overlap, as it ensures every single spot in the sky is covered at
a gain near or above the half power level even at the high end
of the observing band.
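
As a rough cross-check of the quoted beam sizes, the sketch below assumes that the TAB width scales purely inversely with frequency (a diffraction-limited approximation that is not stated in the text) and anchors the scaling to the 23′ FWHM at 155 MHz. The function name and defaults are illustrative only.

```python
def tab_fwhm_arcmin(freq_mhz, fwhm_ref_arcmin=23.0, freq_ref_mhz=155.0):
    """Approximate TAB FWHM (arcmin), assuming FWHM is proportional to 1/frequency."""
    return fwhm_ref_arcmin * freq_ref_mhz / freq_mhz

fwhm_low = tab_fwhm_arcmin(140.0)   # low end of the band
fwhm_high = tab_fwhm_arcmin(171.0)  # high end of the band
print(f"FWHM at 140 MHz: {fwhm_low:.1f} arcmin; at 171 MHz: {fwhm_high:.1f} arcmin")
```

Under this assumption the scaled widths agree with the quoted 25.3′ and 20.7′ to within about one per cent; the small residual presumably reflects the detailed beam model.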

Finally, as with any other aperture array, the sensitivity is
not uniform across the sky and is strongly declination dependent;
to first order, the loss in sensitivity is by a factor $\cos(\theta_z)$,
where $\theta_z$ is the zenith angle. In principle, this can be
compensated to a certain extent by longer integrations, though in
practice, the inherent limitations of our data recording system
(VCS) limit this to no more than 90 minutes, and we therefore
use 80-minute recordings for all pointings. In addition, the
sensitivity will not be uniform across the sky due to other factors;
e.g., the sky background temperature $T_{\rm sky}$ is direction
dependent, and there is a loss in sensitivity from severe pulse broadening
for distant pulsars, which applies to the sight lines within the
Galactic plane or toward the Galactic centre. Some of these
are considered in detail in §2.4.


2.4 Survey sensitivity 

The sensitivity of a pulsar survey is determined by the combination
of some instrumental and processing parameters and
a variety of broadening effects on pulsar signals. Following
Dewey et al. (1985), the minimum detectable flux density $S_{\rm min}$ for
a pulsar with period $P$ and effective pulse width $W_{\rm eff}$, down
to detection significance $({\rm S/N})_{\rm min}$, i.e., minimum detectable
signal-to-noise ratio, is related to the telescope gain $G$ and
system temperature $T_{\rm sys}$, which is the sum of the receiver and
sky background temperatures, i.e.,

$$ S_{\rm min} = \frac{({\rm S/N})_{\rm min}\,(T_{\rm recv} + T_{\rm sky})}{G\,\sqrt{n_{\rm pol}\,t_{\rm obs}\,B_{\rm obs}}}\,\sqrt{\frac{W_{\rm eff}}{P - W_{\rm eff}}} \qquad (1) $$

where $n_{\rm pol}$ is the number of polarisations summed, $t_{\rm obs}$ is the
integration time and $B_{\rm obs}$ is the recording bandwidth. As
evident from this equation, the sensitivity is maximum for long-period
pulsars with a small duty cycle, i.e. when $W_{\rm eff} \ll P$.
The gain $G = A_{\rm eff}/2k_{\rm B}$, where $A_{\rm eff}$ is the effective collecting
area and $k_{\rm B}$ is the Boltzmann constant. At 150 MHz, $A_{\rm eff} \approx
2750\,{\rm m}^2$ for a 128-tile MWA (Tingay et al., 2013), which may
imply $G \sim 1\,{\rm K\,Jy^{-1}}$; however, for an aperture array such
as the MWA, it is a strong function of the zenith angle, i.e.
$G(\theta_z) = G_{\rm max}\cos(\theta_z)$, where $\theta_z$ is the zenith angle and $G_{\rm max}$ is
the gain at $\theta_z = 0$. Moreover, for the drift-scan type observations
that we employ for SMART, $G$ depends on the offset from
the phase centre, and can be $\sim 0.5\,G_{\rm max}$ at the half power
point. We therefore assume a conservative $G \approx 0.5\,{\rm K\,Jy^{-1}}$ for
all our sensitivity calculations. This assumes a full coherent
beam sensitivity, i.e., perfect calibration for TAB formation
and no loss of sensitivity due to flagged tiles. In practice,
a small number of tiles ($\lesssim 10$) are typically flagged due to
malfunctioning, sub-optimal performance or poor calibration
solutions. As we detail in §3, the strategy of observing multiple
calibrators for each SMART observation allows us to perform useful
cross-checks and maximise the achievable sensitivity using the
best available calibration solutions.
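
The gain figures above can be reproduced with a few lines of arithmetic. The sketch below is only an illustration of $G = A_{\rm eff}/2k_{\rm B}$ with the first-order $\cos(\theta_z)$ scaling; the function name and the unit conversion constants are ours.

```python
import math

K_B = 1.380649e-23  # Boltzmann constant (J/K)
JY = 1.0e-26        # 1 Jy in W m^-2 Hz^-1

def gain_k_per_jy(a_eff_m2, zenith_angle_deg=0.0):
    """Gain in K/Jy for effective area a_eff_m2, scaled by cos(zenith angle)."""
    g_max = a_eff_m2 * JY / (2.0 * K_B)
    return g_max * math.cos(math.radians(zenith_angle_deg))

g_zenith = gain_k_per_jy(2750.0)        # full 128-tile array at zenith
g_30deg = gain_k_per_jy(2750.0, 30.0)   # same array at a zenith angle of 30 deg
print(f"G(zenith) = {g_zenith:.3f} K/Jy, G(30 deg) = {g_30deg:.3f} K/Jy")
```

The zenith value comes out very close to 1 K/Jy, so the adopted 0.5 K/Jy corresponds to the half-power point of a zenith pointing.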

At the low frequencies of the MWA, the system temperature
$T_{\rm sys}$ is dominated by the sky background $T_{\rm sky}$. Both
$T_{\rm recv}$ and $T_{\rm sky}$ are frequency dependent, and $T_{\rm sky}$ is also a
strong function of the direction $(l, b)$, where $l$ and $b$ are the
Galactic longitude and latitude, respectively. We assume a
mean $T_{\rm recv} = 50$ K for the 140-170 MHz band. Excluding a
~10° cone around the Galactic centre, $T_{\rm sky}$ can vary from
~200 K toward $|b| > 60°$ to as much as ~1200 K in the plane,
toward $\gtrsim 10°$ from the Galactic centre, where $T_{\rm sky}$ can be as
large as ~$10^4$ K at 155 MHz. We use the Haslam et al. (1982)
map as the reference and assume the $T_{\rm sky} \propto \nu^{-2.55}$ scaling from
Lawson et al. (1987). Given this strong dependence of $T_{\rm sky}$
on $(l, b)$, we consider two cases: (1) the sky at $|b| < 5°$, where
mean $T_{\rm sky} \sim 600$ K, and (2) the sky at $|b| \geq 5°$, where mean
$T_{\rm sky} \sim 270$ K; i.e., $T_{\rm sys} = 630$ K and 300 K, respectively, as
shown in Fig. 2.
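
To make Equation (1) concrete, the following sketch evaluates it for the off-plane case using values quoted in this section ($T_{\rm sys} = 300$ K, $G = 0.5$ K/Jy, a 3% duty cycle and the 30.72-MHz recording bandwidth). The detection threshold of $({\rm S/N})_{\rm min} = 10$ and the use of two summed polarisations are assumptions made for illustration and are not taken from the text; smearing from the dedispersion plan is ignored.

```python
import math

def s_min_mjy(snr_min, t_sys_k, gain_k_per_jy, n_pol, t_obs_s, bw_hz, duty_cycle):
    """Minimum detectable flux density (mJy) from the radiometer equation, Eq. (1)."""
    radiometer = snr_min * t_sys_k / (gain_k_per_jy * math.sqrt(n_pol * t_obs_s * bw_hz))
    # W_eff / (P - W_eff) depends only on the duty cycle W_eff / P
    width_term = math.sqrt(duty_cycle / (1.0 - duty_cycle))
    return 1.0e3 * radiometer * width_term

common = dict(snr_min=10.0,        # assumed threshold, not from the text
              t_sys_k=300.0, gain_k_per_jy=0.5,
              n_pol=2,             # assumed: both polarisations summed
              bw_hz=30.72e6, duty_cycle=0.03)

s_deep = s_min_mjy(t_obs_s=4800.0, **common)    # full 80-minute observation
s_shallow = s_min_mjy(t_obs_s=600.0, **common)  # 10-minute first pass
print(f"deep: {s_deep:.1f} mJy, shallow: {s_shallow:.1f} mJy")
```

With these inputs the full-length and 10-minute integrations differ by a factor of $\sqrt{8} \approx 2.8$, in line with the shallow pass reaching roughly one-third of the full sensitivity before any additional smearing losses are included.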

Intrinsic pulses are broadened due to a variety of effects,
as discussed earlier. As detailed in Lorimer & Kramer (2012),
the total smearing time $\tau_{\rm tot}$ is the quadratic sum of the finite
sampling time $\tau_{\rm samp}$, the residual dispersive smearing due to the
finite frequency channel width $\tau_{\rm chan}$, the dispersive smearing across
the full recording band due to finite DM steps in trial DM
values $\tau_{\rm BW}$, and the dispersive smearing resulting from the
piece-wise linear approximation of the quadratic dispersion law in the
sub-band dedispersion algorithm employed in searches $\tau_{\rm sub}$.
Fig. 2 summarises these for our current first-pass processing.
The planned second-pass search will significantly enhance the
search sensitivity by processing the full observation (4800 s)
and using more optimal DM steps, i.e. many more trial
DMs than are used in the current search.

Figure 2. Left: Minimum detectable flux density, $S_{\rm min}$, for the first-pass processing of the SMART survey as a function of DM. Sensitivity limits, assuming
a 10-minute integration time, are plotted for different pulse periods, P = 1.0, 0.1, 0.01, 0.001 s, and for two different system temperature values $T_{\rm sys}$: one
corresponding to the mean $T_{\rm sky}$ for regions away from the Galactic plane, and the other for the mean $T_{\rm sky}$ in the plane, but excluding the region toward the Galactic
Centre. The effect of pulse broadening due to interstellar scattering (Bhat et al. 2004) is shown by the dotted lines. Right: Pulse broadening (smearing) incurred
by using the first-pass processing dedispersion plan (Table 2) due to various factors such as the finite sampling time, dispersive smearing due to the incoherent
de-dispersion algorithm used, and the effects of multi-path scattering based on the $\tau_d$-DM relation from Bhat et al. (2004). The grey shaded region denotes
a range one order of magnitude larger or smaller than the predicted scattering.


As evident from the figure, for our current first-pass processing,
the total smearing time is dominated by finite DM
steps; this sub-optimal choice was made in an effort to maximise
the number of observations that can be processed to
completion toward a shallow all-sky survey within available
computing resources. The dedispersion plan used is
shown in Table 2. In effect, we progressively downsample
the data five times over the DM range searched, each time
making the DM step size coarser. At DM $\gtrsim$ 3 pc cm$^{-3}$, the
dispersive smearing time within the 10-kHz channel is larger
than the native sampling time (100 µs) but is still a smaller
contribution to the total smearing time than that due to the
DM step size. As a result, the net smearing time $\tau_{\rm tot}$ displays
a step-wise increase as shown in Fig. 2, given our dedispersion
plan. At very low DMs $\lesssim$ 10 pc cm$^{-3}$, $\tau_{\rm tot} \sim 0.7$ ms but
increases to ~10 ms at DM ~100 pc cm$^{-3}$. In essence, our first-pass
search severely compromises the sensitivity to millisecond
pulsars (MSPs) at larger DMs and shorter periods, i.e., it is
currently sensitive to MSPs at DM $\lesssim$ 30 pc cm$^{-3}$ and $P \gtrsim 10$ ms.
As shown in Fig. 2, at those larger DMs, the smearing due
to scattering (i.e. pulse broadening) can also be significant.
The broadening time here is based on the empirical relation
in Bhat et al. (2004), which is mostly relevant for pulsars near
the plane. As is well known, these scattering estimates can be
uncertain by more than an order of magnitude, denoted by
the grey shaded region.
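
The dominant smearing terms can be estimated from the cold-plasma dispersion law. The sketch below is a simplified illustration rather than the calculation used for Fig. 2: it assumes band edges of 140 and 170.72 MHz, takes the worst-case DM error to be half a DM step, and omits the sub-band and scattering terms. All function names are ours.

```python
import math

K_DM = 4.148808e-3  # dispersion constant (s GHz^2 pc^-1 cm^3)

def channel_smearing_s(dm, chan_bw_mhz, freq_mhz):
    """Dispersive smearing (s) across one narrow channel centred at freq_mhz."""
    f_ghz = freq_mhz / 1.0e3
    return 2.0 * K_DM * dm * (chan_bw_mhz / 1.0e3) / f_ghz**3

def dm_step_smearing_s(dm_step, f_lo_mhz, f_hi_mhz):
    """Delay (s) across the band for a DM error of half a DM step (assumed worst case)."""
    lo, hi = f_lo_mhz / 1.0e3, f_hi_mhz / 1.0e3
    return K_DM * (dm_step / 2.0) * (lo**-2 - hi**-2)

def total_smearing_s(*terms):
    """Quadrature sum of the individual smearing terms."""
    return math.sqrt(sum(term * term for term in terms))

t_chan_140 = channel_smearing_s(dm=50.0, chan_bw_mhz=0.01, freq_mhz=140.0)
t_tot_low_dm = total_smearing_s(
    1.0e-4,                                                      # native sampling time
    channel_smearing_s(dm=5.0, chan_bw_mhz=0.01, freq_mhz=155.0),
    dm_step_smearing_s(dm_step=0.02, f_lo_mhz=140.0, f_hi_mhz=170.72),
)
print(f"channel smearing at DM 50, 140 MHz: {1e3 * t_chan_140:.2f} ms")
print(f"total smearing at DM 5: {1e3 * t_tot_low_dm:.2f} ms")
```

Under these assumptions the total at low DM is close to the ~0.7 ms quoted above and is dominated by the DM-step term, which is the behaviour described in the text.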


Table 2. Dedispersion plan for the first-pass SMART processing

DMmin        DMmax        δDM          N_DM    d_s    Δt_eff
(pc cm^-3)   (pc cm^-3)   (pc cm^-3)                  (ms)
1.0          12.2         0.02         560     1      0.1
12.2         24.4         0.03         406     2      0.2
24.4         48.8         0.06         406     4      0.4
48.8         97.6         0.11         443     8      0.8
97.6         195.2        0.23         424     16     1.6
195.2        250.0        0.46         119     32     3.2

Columns 1 and 2 denote the range in dispersion measure,
between DMmin and DMmax, with a DM step size of
δDM, resulting in N_DM trial DM values. The downsampling
factor is denoted by d_s, i.e. the factor by which the
temporal resolution is averaged to yield a net resolution Δt_eff.
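
The trial counts in Table 2 follow directly from the DM ranges and step sizes. The sketch below recomputes them, assuming that the number of trials in each range is the range width divided by the step, rounded down, and that the last row uses a downsampling factor of 32 (consistent with five successive downsampling stages and a 3.2 ms resolution).

```python
import math

T_NATIVE_MS = 0.1  # native VCS time resolution (100 microseconds)

# (DM_min, DM_max, DM step, downsampling factor) for each row of the plan
PLAN = [
    (1.0, 12.2, 0.02, 1),
    (12.2, 24.4, 0.03, 2),
    (24.4, 48.8, 0.06, 4),
    (48.8, 97.6, 0.11, 8),
    (97.6, 195.2, 0.23, 16),
    (195.2, 250.0, 0.46, 32),  # factor of 32 is our reading of the last row
]

def n_trials(dm_lo, dm_hi, dm_step):
    """Number of trial DMs in a range; small epsilon guards against rounding error."""
    return int(math.floor((dm_hi - dm_lo) / dm_step + 1.0e-6))

counts = [n_trials(lo, hi, step) for lo, hi, step, _ in PLAN]
resolutions_ms = [round(ds * T_NATIVE_MS, 3) for *_, ds in PLAN]
total_trials = sum(counts)
print(counts, total_trials, resolutions_ms)
```

The per-row counts sum to the 2358 trial DMs quoted for the first-pass search.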


The theoretical sensitivity is shown in Fig. 2 for different
periods, P = 1.0, 0.1, 0.01 and 0.001 s. In all these calculations,
we have assumed a duty cycle of 3%, i.e., $W_{\rm eff}/P = 0.03$. For
each period, a pair of curves are shown: one for the best-case
scenario, i.e., searches away from the plane, where $T_{\rm sys} \sim$
305 K; and the second for the sky near the plane where the
mean $T_{\rm sys}$ is twice as high. In either case, the sensitivity is
maximum for long-period pulsars, at low to moderate DMs of
$\lesssim$ 50 pc cm$^{-3}$ and toward $|b| > 5°$.

With our first-pass processing scheme (i.e., 10-minute
integrations and a sub-optimal dedispersion plan), we reach a
limiting sensitivity of $S_{\rm min} \sim$ 7-12 mJy for long-period pulsars,
and ~12-25 mJy for MSPs at low to moderate DMs. For the
proposed deep-pass processing (i.e., ~80-minute integrations
and a more granular dedispersion plan), we can achieve a
limiting sensitivity of $S_{\rm min} \sim$ 2-3 mJy for long-period pulsars
and ~5-10 mJy for MSPs at low to moderate DMs. In this
case, the SMART survey sensitivity is comparable to that of the
LOTAAS survey in the northern hemisphere. While LOTAAS
can be twice as sensitive as SMART for long-period pulsars,
the sensitivity for $P \lesssim 10$ ms is similar, owing to a lower
degradation in sensitivity in the SMART band. Compared to
the Southern Pulsar Survey of the 1990s at 430 MHz (i.e., a
wavelength of 70 cm), also known as the Parkes 70cm survey
(Manchester et al., 1996), the SMART survey is ~3-5 times
more sensitive, especially for pulsars at DMs $\lesssim$ 100 pc cm$^{-3}$ and
spectral index $\alpha \lesssim -1.5$. Even the ongoing shallow survey
is comparable to the 70cm survey in theoretical sensitivity,
and, if anything, slightly more sensitive for steep spectrum pulsars
with no turnover down to ~100 MHz. This provides a strong
motivation to undertake a full-scale pulsar survey with the
MWA.


2.5 Effective dwell time and sensitivity 


Unlike most other pulsar surveys, where single-dish telescopes 
are used to track targeted positions for small time intervals (e.g., 
HTRU, GBNCC), the SMART observations are drift scans, 
where the primary beam is pointable but static in horizontal 
coordinates (azimuth and zenith angle) once an observation 
starts and the sky moves through the FoV. When forming 
TABs, we track the sky position as it moves through the MWA 
primary beam and as a consequence not all TABs necessarily 
remain within a sensitive part of the primary beam for the full 
80-minute duration. 

The amount of time spent within an individual observa- 
tion FWHM depends both on the observing declination (i.e., 
where the primary beam is pointed) and the target source 
position to be tracked with a TAB. As an example, in Fig. 3 
we plot some representative TAB pointings along with the 
primary beam response for the same observations as in Fig. 1 
in horizontal coordinates. As already noted, our sensitivity 
drops substantially as we observe at larger zenith angles, which 
we visualise by having the colour scale represent the zenith- 
normalised primary beam power as a proxy for sensitivity. 
Secondly, the TAB pointing directions are traced before, dur- 
ing, and after the 80-minute observation, which highlights 
that not all targeted positions remain in a usable part of the 
primary beam. These effects highlight at least three points 
for consideration: (1) it will be an inefficient use of resources 
to track certain pointings for the full observing duration, (2) 
tracking pointings naively for the full duration, especially if a 
significant fraction of the time is spent below the 10% power
point, may actually reduce sensitivity to pulsars, and (3) the 
full-sky sensitivity will be patchy regardless of TAB forming 
strategies (although this is partly mitigated by having observa- 
tions overlap by ~20% at the central frequency). To address 
(1) we can estimate the time a source remains in a reasonable 
power range of the primary beam and only form TABs from 
the appropriate subset of voltages recorded (e.g., while the 
target source is not in a null). For (2) we must strike a careful
balance between achieving maximal sensitivity (by cutting
off parts of the TAB) and dwell time (which benefits searches
for longer-period pulsars and single pulse events). The consequence
of (3) is unavoidable given the telescope configuration
and observing strategy employed, but is quantifiable.

Figure 3. Tied-array beam traces through the MWA primary beam for SMART
observations. Three example pointing directions for each observation are
traced including 1 hr before and 1 hr after the 80-minute observation. The
target trace (rotating clockwise as time advances) is coloured pink to represent
the trajectory before the observation, red during the observation, and
blue after the observation is complete. North is at 0° and the azimuth angle
increases to the East. The colour scales are the same for each subplot,
highlighting the sensitivity penalty incurred for observing away from zenith.

We can evaluate the relative sensitivity (assuming an 80-minute
track for a given TAB) by summing the primary beam
response power at discrete time steps, where we use our current 
best Full Embedded Element (FEE) model (Sokolowski et al., 
2017), normalised to the equivalently summed power that 
would represent the best possible dwell time and sensitivity 
combination. For our purposes we define this quantity as 
the sum of the primary beam power at zenith for the full 
observing duration (i.e., imagining we can track an equatorial 
position with full zenith sensitivity). This is useful as it scales 
the effective sensitivity to a quantity close to what a single-dish 
steerable telescope could achieve. In Fig. 4 we present these
effective sensitivity maps, in equatorial coordinates, for the
same example observations used in Fig. 3.

Figure 4. Effective sensitivity maps, assuming a full 80-minute tracking and
integration for a given TAB sky position. The colour map is normalised to
the best possible sensitivity (described in the text), and contours at 25, 50,
and 75 per cent are drawn for clarity. Due to the drift scan nature of the
observations versus the tracking TABs, we can never achieve the best possible
sensitivity. Right Ascension and Declination are marked by the vertical and
horizontal curved grid lines, respectively.


3. Data processing and analysis 

In terms of data collection and processing requirements, the 
SMART survey is the largest all-sky pulsar survey undertaken 
in the southern hemisphere, and is only the second largest after 
LOTAAS. The SMART survey will accrue ~3 PB of VCS data, 
compared to ~1 PB (search mode data) by the highly success- 
ful Parkes HTRU survey, and ~8 PB (beamformed data) by 
LOTAAS. As outlined earlier, the survey will cover the sky in 
70 VCS pointings, each VCS observation being 4800s (42 TB). 
The management and processing of this volume of data is non- 
trivial, particularly considering the computational resources 
currently available. The processing software and pipelines are 
developed, tested and benchmarked on Pawsey's Galaxy/Gar- 
rawarla clusters, and subsequently ported and benchmarked on 
Swinburne’s OzSTAR supercomputer. The time on OzSTAR
is secured via the merit allocation scheme under Astronomy
and Supercomputer time allocation, and is typically 0.5-0.6
million service units (CPU core hours) per annum. These constraints
largely drive the initial processing strategies, thereby
necessitating a first-pass shallow survey.

Compared to the HPC resources available at Pawsey, the
processing efficiency has been higher on OzSTAR,
where the current benchmarks are 2 kSU for beamforming and
25 kSU for searching a 10-min observation (4.4 TB), where
1 kSU = 1000 service units (CPU core hours). The current
allocation thus allows processing of 9 observations (fields) per
semester, where each 10-min VCS observation is processed
for ~6000 tied-array beams, each of which is then searched
in 2358 trial DMs, out to 250 pc cm$^{-3}$. The completion of
first-pass processing will thus require ~2 million core hours.
Scaling from the current benchmarks, we would thus expect
1,500 kSU per full observation for deeper searches, and 60
million core hours for full DM searches (~10,000 searches, for
a maximum DM of 250 pc cm$^{-3}$), necessitating the integration of
GPU-based search processing in the future.
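
The first-pass budget quoted above can be checked with simple arithmetic. In the sketch below, the lower end of the annual allocation (0.5 million core hours) and an even split between two semesters are assumptions made for illustration.

```python
BEAMFORM_KSU = 2               # per 10-min observation
SEARCH_KSU = 25                # per 10-min observation
N_POINTINGS = 70
ANNUAL_ALLOCATION_KSU = 500    # assumed: lower end of the 0.5-0.6 million range
SEMESTERS_PER_YEAR = 2         # assumed: allocation split evenly

ksu_per_observation = BEAMFORM_KSU + SEARCH_KSU
first_pass_total_ksu = N_POINTINGS * ksu_per_observation
observations_per_semester = (ANNUAL_ALLOCATION_KSU // SEMESTERS_PER_YEAR) // ksu_per_observation

print(f"{ksu_per_observation} kSU per observation, "
      f"{first_pass_total_ksu} kSU in total, "
      f"{observations_per_semester} observations per semester")
```

The total of 1890 kSU is consistent with the quoted ~2 million core hours, and the per-semester figure matches the 9 fields stated above.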

An overview of the processing pipeline is presented in 
Fig. 5, the details of which are described in the sections below. 
In essence, this involves preprocessing and beamforming of 
voltage data from 128 tiles of the array to generate beamformed
time series, before the data can be processed through the search 
and detection pipelines. The main steps are outlined below. 


3.1 Pre-processing and Beamforming 


The main step in the preprocessing stage involves processing 
VCS data so they can be calibrated and coherently combined 
to produce beamformed time series at the native resolution 
of 100-µs/10-kHz of the VCS. The array calibration is performed
using one of the standard calibrators (e.g., 3C444),
recorded in the visibility mode at the default 0.5-s/40-kHz res- 
olution, where complex gain solutions (amplitude and phase) 
are obtained for each of the 128 tiles, for every coarse channel 
(1.28 MHz wide), using the Real Time System (RTS) soft- 
ware package. The procedure is essentially similar to those 
employed for other VCS observations (e.g., Swainston et al., 
2021). The calibration solutions can then be used to coher- 
ently combine the voltage data in phase using the tied-array 
beamformer, the conceptual details and implementation of 
which are detailed in Ord et al. (2019). The functionality was 
enhanced, and GPU parallelised, in preparation for SMART 
data processing (Swainston et al., 2022). 

The beamformed data are written as Stokes I at 100-µs/10-kHz
resolution. The current implementation allows processing
120 coarse-channel beams at once, i.e., 5 full-bandwidth
(30.72 MHz) beams, resulting in a data rate of 87 GB/beam 
for a 10-minute observation. For each survey pointing, this 
amounts to ~500 TB in beam-formed data. These data are 
equivalent to those that would emerge from standard pulsar
backends and so can be processed using standard pulsar search 
packages. Typically, data would be processed to generate RFI 
masks; however the superb radio-quiet environment at the 
telescope site and preferential observing during the nightly 
hours (and within an hour of the source transit) make this 
step not essential for the SMART data. In most cases, data are
minimally affected by RFI, and consequently no RFI-related
processing is carried out in the ongoing first-pass processing.

Figure 5. Workflow diagram illustrating the first-pass SMART processing pipeline: voltage data at 100-µs/10-kHz resolutions are recorded from 128 tiles of the
array after tile beamforming and channelisation stages, and are subsequently ported to the Pawsey supercomputer where the initial processing including
calibration, beamforming and known pulsar detections are carried out. Search processing is currently performed on the OzSTAR supercomputer, and is
limited to basic periodicity searches.

The large FoV of the MWA means excellent prospects for 
detecting multiple known pulsars within each pointing, which 
is also important for crucial data quality checks and initial 
assessment of array calibration and tied-array sensitivity. In 
short, each SMART observation is processed for known pulsars 
within the primary beam (~610 deg$^2$), using a custom pulsar
detection pipeline. 


3.2 Search pipeline 


The current SMART pipeline includes a GPU-based pipeline 
for front-end processing (beamforming) and a CPU-based 
pipeline for downstream (search) processing. The search pipeline 
is based on the Pulsar Exploration and Search Toolkit (PRESTO; 
Ransom, 2001, 2011) pulsar search software suite, with the 
addition of machine-learning (ML) tools adopted from the 
LOTAAS classifier (Tan et al., 2018b). This was adopted as a 
first-pass processing strategy, to ensure an end-to-end working 
pipeline from the data collection and reordering stage (occurring
at the observatory site) to array calibration/quality checks
(Pawsey) and search processing (OzSTAR). To encapsulate
the full search workflow, we make extensive use of Nextflow^d
(Di Tommaso et al., 2017) to manage data input, output,
processing tasks, and intermediate or final product creation and
tracking.

In the near future, as we transition to full-sensitivity searches, 
the search component will be replaced by a GPU-based im- 
plementation. Here we present a detailed breakdown of the 
current SMART search pipeline, where 10-min data (4.8 TB) 
are processed from each observation. 


c See https://github.com/scottransom/presto
d See https://github.com/nextflow-io/nextflow


3.2.1 Dedispersion and periodicity search 


The beamformed data are processed to create dedispersed time
series for each beam. As mentioned earlier, for the first-pass
processing, the maximum DM searched is 250 pc cm$^{-3}$. At higher
DMs, scattering can be significant; e.g., pulse broadening
times $\gtrsim$ 100 ms are expected at 155 MHz for sight lines
toward $|b| < 5°$, and $l \gtrsim 330°$ or $l \lesssim 30°$, where such high
DMs can be expected. Further, even with 10-kHz channels,
DM smearing can still be significant at low frequencies. For
instance, at a frequency of 140 MHz (i.e., the low end of the
SMART band), intra-channel dispersion smearing is ~1.5 ms
at DM = 50 pc cm$^{-3}$, and ~10 ms at DM ~ 250 pc cm$^{-3}$. The
dedispersion plan was created using the PRESTO DDplan.py
utility, but with the caveat that sub-optimal settings were chosen
(the use of coarser DM steps) to limit the number of DM
trials to 2358, given the limitation of computational resources.
The prepsubband tool from PRESTO was used to create
incoherently dedispersed time series from the PSRFITS (i.e.,
search mode) files. It makes use of the sub-band dedispersion
technique, which uses a piece-wise linear approximation to the
quadratic dispersion relation. The dedispersion plan employed
in the first-pass search is shown in Table 2.

Searching for periodic signals involves computing the power
spectra of the dedispersed time series, which is performed using
the realfft tool within PRESTO, by applying Fourier
transform techniques. These power spectra are then searched
for periodicities using accelsearch (Ransom et al., 2002),
which detects the most significant periodic signals and uses
harmonic summing to recover the power distributed across the
harmonics of a given spin frequency. No acceleration searches are
performed in this first pass; i.e., searches are only performed at
zero acceleration. Acceleration searches would incur a significant
processing cost, given the large data rates and the number
of trial DMs required, but will be part of the second-pass search.
If the significance of any spectral bin is in excess of 2σ, it is
marked as a candidate and the corresponding harmonics up to
the 16th are summed to increase the detection significance.
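
The idea behind harmonic summing can be illustrated with a toy example. The sketch below is not the accelsearch implementation: it simply adds the normalised powers at integer multiples of each trial fundamental bin, applied to a simulated narrow pulse train in white noise. The simulated period, pulse width, amplitude and random seed are arbitrary choices.

```python
import numpy as np

def harmonic_sum(power, n_harm):
    """Sum powers at bins k, 2k, ..., n_harm*k for every fundamental bin k."""
    n_fund = (len(power) - 1) // n_harm   # largest k whose top harmonic is in range
    k = np.arange(1, n_fund + 1)
    summed = np.zeros(n_fund)
    for h in range(1, n_harm + 1):
        summed += power[h * k]
    return k, summed

rng = np.random.default_rng(0)
n_samp, t_samp = 2**16, 1.0e-3            # number of samples, sampling time (s)
t = np.arange(n_samp)
series = rng.normal(size=n_samp)          # unit-variance white noise
series[(t % 256) < 8] += 1.0              # pulse train: period 256 samples, ~3% duty cycle

power = np.abs(np.fft.rfft(series)) ** 2
power /= np.median(power) / np.log(2.0)   # normalise so the mean noise power is ~1

bins, summed = harmonic_sum(power, n_harm=16)
best_bin = int(bins[np.argmax(summed)])
best_period = n_samp * t_samp / best_bin  # recovered period (s)
print(f"best fundamental bin: {best_bin}, period: {best_period:.3f} s")
```

Because a narrow pulse spreads its power over many harmonics, the summed statistic at the true fundamental stands out far more clearly than any single spectral bin.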

A sifting procedure is then performed on the list of candidates
from all 2358 DM trials. We adopt a fairly standard
procedure, quite similar to that followed for LOTAAS, where
candidates with P < 1 ms or P > 30 s are rejected,^e as well as
those with DM < 1 pc cm$^{-3}$. Candidates with similar DMs
and harmonically related periods are then grouped, and only
the instance with the highest S/N is kept. From this reduced
candidate list, only those with >5σ detections are then folded.
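
The sifting logic can be sketched as follows. This is a simplified stand-in for the actual sifting code: it only recognises integer period ratios as harmonic relations, and the DM tolerance, harmonic tolerance and significance cut (taken here as 5) are hypothetical values chosen for illustration.

```python
def is_harmonic(p1, p2, max_harm=16, tol=1.0e-3):
    """True if the two periods are equal or related by an integer ratio (simplified)."""
    ratio = max(p1, p2) / min(p1, p2)
    m = round(ratio)
    return 1 <= m <= max_harm and abs(ratio - m) / m <= tol

def sift(cands, p_min=1.0e-3, p_max=30.0, dm_min=1.0, dm_tol=1.0, snr_min=5.0):
    """Reject out-of-range candidates, merge harmonic duplicates, apply a significance cut."""
    in_range = [c for c in cands
                if p_min <= c["period"] <= p_max and c["dm"] >= dm_min]
    kept = []
    # Strongest first, so each group is represented by its highest-S/N member
    for c in sorted(in_range, key=lambda c: c["snr"], reverse=True):
        duplicate = any(abs(c["dm"] - k["dm"]) <= dm_tol
                        and is_harmonic(c["period"], k["period"]) for k in kept)
        if not duplicate:
            kept.append(c)
    return [c for c in kept if c["snr"] >= snr_min]

candidates = [
    {"period": 0.5,    "dm": 20.0, "snr": 12.0},
    {"period": 0.25,   "dm": 20.1, "snr": 8.0},   # harmonic of the first
    {"period": 1.0,    "dm": 19.9, "snr": 7.0},   # sub-harmonic of the first
    {"period": 0.0005, "dm": 20.0, "snr": 9.0},   # P < 1 ms
    {"period": 0.7,    "dm": 0.5,  "snr": 15.0},  # DM < 1 pc cm^-3
    {"period": 0.3,    "dm": 80.0, "snr": 6.0},
    {"period": 0.123,  "dm": 40.0, "snr": 4.0},   # below the significance cut
]
survivors = sift(candidates)
print([(c["period"], c["dm"]) for c in survivors])
```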


3.2.2 Candidate folding

Folding of the candidates is performed using the prepfold
tool, which creates the associated candidate files and standard
diagnostic plots such as those shown in Fig. 6. Since our
pipeline uses the LOTAAS classifier, the folding analysis is
carried out using the identical parameter setup as in the LOFAR
search pipeline; i.e., 100 pulse phase bins, 256 sub-bands and 120
sub-integrations for P > 10 ms, and 50 pulse phase bins
and 40 sub-integrations for P < 10 ms. With this, the folded
candidate information can be classified and processed using the
ML classifier that we have adopted from the LOFAR search.


3.2.3 Single pulse search 

Single pulse searches have proven to be effective for detecting
the class of pulsars that emit sporadically (e.g., RRATs, and
giant-pulse emitters such as the Crab). The basic algorithm
involves trialling a range of box car widths, $2^n t_{\rm samp}$, where
$t_{\rm samp}$ is the sampling time resolution (100 µs for SMART) and
n = 0, 1, 2, ..., N, where N corresponds to the maximum
width searched (e.g., Cordes & McLaughlin, 2003), and
detecting ‘events’ that are above a set threshold. It is not
computationally demanding, and is routinely performed in most pulsar
searches. The pipeline has been tested using a SMART observation
containing the Crab pulsar, and has also yielded a blind
detection of the LOFAR-discovered RRAT J0301+20 (Michilli et al.,
2018). Integrating this into the processing chain is part of our
second-pass search strategy.
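
A minimal version of the box-car matched filter is sketched below. It is an illustration of the technique rather than the pipeline's single-pulse search: the time series is normalised with a robust noise estimate, smoothed with box cars of width $2^n$ samples, and the strongest event at each width is recorded. The injected pulse, its amplitude and the random seed are arbitrary.

```python
import numpy as np

def boxcar_search(series, max_power):
    """Return (width, start index, S/N) of the strongest event for widths 2**n samples."""
    med = np.median(series)
    sigma = 1.4826 * np.median(np.abs(series - med))   # robust noise estimate (MAD)
    x = (series - med) / sigma
    csum = np.concatenate(([0.0], np.cumsum(x)))
    results = []
    for n in range(max_power + 1):
        width = 2 ** n
        # Sliding-window sum via cumulative sums; dividing by sqrt(width) gives S/N
        snr = (csum[width:] - csum[:-width]) / np.sqrt(width)
        start = int(np.argmax(snr))
        results.append((width, start, float(snr[start])))
    return results

rng = np.random.default_rng(1)
series = rng.normal(size=20000)
series[5000:5016] += 6.0          # injected pulse: 16 samples wide

events = boxcar_search(series, max_power=8)
best_width, best_start, best_snr = max(events, key=lambda e: e[2])
print(f"best width: {best_width} samples, start: {best_start}, S/N: {best_snr:.1f}")
```

The S/N peaks when the box-car width matches the pulse width, which is why a geometric ladder of widths is sufficient in practice.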


3.3 Pulsar detection pipeline 

3.3.1 ML classification of candidates 

For each VCS survey pointing (~610 deg$^2$, which is tessellated
into ~6000-8000 beams), the processing typically results in
~135,000 candidates. Scaling for a significantly larger sensitivity
(3×) and a larger number of DM trials (~4×) anticipated
in full-scale deep searches in the second pass, we may expect
over 50 million candidates. Even for the first pass, as many as
9 million candidates can be expected, extrapolating the rate of
candidates requiring scrutiny from the current pipeline.
Clearly, visual inspection of that many candidates is unrealistic,
thus necessitating the use of ML classifiers.


e This period range was adopted given the minimum and maximum
period of known pulsars in the ATNF pulsar catalogue when our processing
commenced, which was 1.3 ms and 23.5 s, respectively.

As an initial strategy, we have adopted the ML software
that was developed for LOTAAS. The algorithm used is described
in Lyon et al. (2016) and Tan et al. (2018a), and is
summarised here. The classifiers use the statistics of the pulse
profile (i.e., mean, variance, skewness and kurtosis) and the
DM curve (i.e., S/N vs DM; see Fig. 6). As described in Tan
et al. (2018a), this basic approach is expanded by also calculating
the correlation coefficient between each sub-band and
the profile, as well as correlation coefficients between each
sub-integration and the profile. In effect, the classifier uses
the statistics of correlation coefficient distributions, in addition
to the statistics of the profile and the DM curve, in order
to classify the periodicity candidates. Four standard models
are used for the regression: (1) decision tree algorithm, (2)
multilayer perceptron, (3) probabilistic Bayes classifier, and (4)
linear support vector machine.

Even without being trained on MWA data, the software
performs reasonably well, with a recall rate of ~83% for the
worst-performing regression model. While clearly not optimised
for an MWA search, it can still provide a significant cull
on the number of candidates that require human scrutiny as
long as the number of false negatives is kept below an acceptable
threshold. To minimise the false negative rate, we use
the provided “ensemble” classifier, which labels candidates as
positive if at least three models classify them as positive. Under
this criterion, the number of candidates is cut down from the
original ~135,000 per pointing to ~20,000 that require
human scrutiny, i.e., an efficiency of ~85%. The false negative
rate can be lowered further by allowing candidates classified as
pulsars by a smaller number of regression models to be passed,
but this comes at the cost of also lowering the efficiency. For
the first-pass processing, we find the current arrangement to
be an acceptable compromise, but will be implementing an
improved ML classifier for the second pass.
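
The voting rule and the quoted efficiency can be written down in a few lines. The sketch below only mimics the decision rule described above (a candidate passes if at least three of the four models label it positive); it does not reproduce the LOTAAS classifier itself, and the function name is ours.

```python
def ensemble_label(votes, min_votes=3):
    """True if at least min_votes of the individual model votes are positive."""
    return sum(bool(v) for v in votes) >= min_votes

# One vote per model: decision tree, multilayer perceptron, Bayes, linear SVM
passes = ensemble_label([True, True, True, False])
fails = ensemble_label([True, True, False, False])

# Culling efficiency implied by the per-pointing candidate numbers
n_before, n_after = 135_000, 20_000
efficiency = 1.0 - n_after / n_before
print(passes, fails, f"efficiency = {efficiency:.1%}")
```

Lowering `min_votes` would pass more candidates and so reduce the false negative rate, at the cost of a lower efficiency, which is the trade-off noted above.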

Of the remaining ~20,000 candidates per pointing, only
a small fraction are true pulsar detections, with the vast majority
of candidates consisting of noise and RFI. Here, we are
extending the definition of RFI to include any artefact from
the MWA signal path that may result in spurious detections.
Owing to the radio-quietness of the observatory site, such
candidates belong almost exclusively to this category, and almost
never arise from external sources. The most common
RFI candidates are those with periods of either 1 second or
with a close harmonic relationship (e.g., 0.5 s, 2 s), relating
to the division of data packets by 1-second boundaries. Such
candidates are sufficiently few (and easily identified) that we do
not apply any automatic procedure for removing them from
our pool of candidates.


3.3.2 Prioritisation and scrutiny of candidates 


The candidates that survive the initial ML cull are still mostly
dominated by noise and RFI detections, with only a small
minority being true pulsar detections. Although all of these
candidates are intended ultimately to be visually inspected, we
have developed a so-called “clustering” algorithm to prioritise
which candidates get inspected first, in order to accelerate the
detection of sufficiently bright, new pulsars.

Figure 6. Examples of standard PRESTO diagnostic plots of original periodic pulsar candidate detections (left panels), and improved detection plots from
follow-up processing for confirmation (right panels). Upper panels are the first pulsar discovered from SMART, PSR J0036−1033, and the lower panels
are the second pulsar, PSR J0026−1955. Initial detections are from 10-min observing durations (first-pass processing), while the confirmation ones are from
longer durations of the same initial detection observations.

The clustering algorithm leverages the fact that the tied-array
beam of the MWA’s compact configuration is relatively
complex, with significant grating lobes located in different
parts of the primary beam. Because the spacing between tied-array
pointings is equal to the FWHM of the main lobe of
the tied-array beam, any sufficiently bright pulsar will likely
be detected in multiple beams. For instance, Fig. 7 shows a
map of multiple detections of PSR B2327−20 superimposed on
the theoretical sensitivity of each tied-array beam towards the
pulsar, as predicted by the array factor formalism developed for
the MWA by Meyers et al. (2017). Since noise candidates will
not be correlated across different beams, prioritising similar
candidates that appear in multiple beams dramatically increases
the likelihood that candidates representing true astrophysical
signals will be inspected first.^f
Candidates are considered similar if

1. they appear in at least two adjacent beams,
2. they have periods within 0.5% of each other, and
3. they have DMs within 3 pc cm$^{-3}$ of each other.


As a demonstration of the usefulness of the clustering
algorithm, we show how it would detect PSR J0026-1955, the
second pulsar discovery in the SMART survey (McSweeney
et al., 2022). In reality, the clustering algorithm was not
implemented until after PSR J0026-1955 was discovered, but it
is interesting to note that the first detection (chronologically)
of this pulsar was a grating lobe detection (at the time, the
candidates were being served up randomly), which motivated
the development of the clustering algorithm in the first place.

$^{f}$ This is counterintuitive compared with multi-beam surveys with Parkes-like
single-dish telescopes, where similar candidates detected in multiple beams
across the sky would indicate RFI.

Figure 7. The theoretical array factor (a proxy for sensitivity) of each tied-array beam towards the pulsar B2327-20, with the red cross marking the position
of the pulsar (left panel) and the beams in which the pulsar was detected (right panel). SMART observation 1226062160 was used for the demonstration.

The final set of detections of PSR J0026-1955 is shown in
Fig. 8, on a backdrop of the theoretical array factor (a proxy for
sensitivity) towards the pulsar, assuming that our current best-fit
position is correct. In this case, three of the search beams
contained the nominal pulsar position in the main lobe, while
several others positioned the pulsar in their respective grating
lobes. All of the displayed detections meet the second and third
clustering criteria (similar periods and DMs). Therefore, any
pair of detections in the same or adjacent beams is considered
a “cluster”, and had the clustering algorithm been in use when this
observation was processed, this pulsar would have been picked
up immediately in multiple clusters.

The clustering algorithm offers no advantage for relatively 
weak pulsars that would be detected only in a single (boresight) 
tied-array beam. Therefore, unclustered candidates are not 
deleted, only deprioritised. 
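The three similarity criteria listed above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the survey's actual implementation: the `Candidate` fields, the `adjacent` beam map and the function names are hypothetical, "within 0.5%" is interpreted as a fractional difference relative to the shorter period, and detections in the same beam are allowed to pair up, following the worked example for PSR J0026-1955.

```python
from dataclasses import dataclass
from itertools import combinations


@dataclass(frozen=True)
class Candidate:
    cand_id: int
    beam: int       # hypothetical index of the tied-array beam
    period: float   # seconds
    dm: float       # pc cm^-3


def similar(a, b, adjacent, period_tol=0.005, dm_tol=3.0):
    """Apply the three criteria: beam adjacency, period match, DM match."""
    beams_ok = (a.beam == b.beam
                or b.beam in adjacent.get(a.beam, ())
                or a.beam in adjacent.get(b.beam, ()))
    period_ok = abs(a.period - b.period) <= period_tol * min(a.period, b.period)
    dm_ok = abs(a.dm - b.dm) <= dm_tol
    return beams_ok and period_ok and dm_ok


def cluster(cands, adjacent):
    """Group candidates into connected sets of similar detections (union-find)."""
    parent = {c.cand_id: c.cand_id for c in cands}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path compression
            x = parent[x]
        return x

    for a, b in combinations(cands, 2):
        if similar(a, b, adjacent):
            parent[find(a.cand_id)] = find(b.cand_id)

    groups = {}
    for c in cands:
        groups.setdefault(find(c.cand_id), []).append(c)
    # Largest groups first; singletons are kept, only deprioritised.
    return sorted(groups.values(), key=len, reverse=True)
```

Sorting by group size is only one possible priority rule; the essential point is that unclustered candidates stay in the queue rather than being discarded.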


3.3.3 Human inspection and ranking 

Just as the clustering algorithm is a method for prioritising 
candidates for human inspection, so too is human inspection 
a method for prioritising candidates for follow-up (see §4). 
Users are served up candidates one at a time and presented 
with the candidate's PRESTO diagnostic plots (e.g., Fig. 6). 
Each candidate is given an integer rating from 1 to 5, with 
higher numbers corresponding to a higher confidence that the 
candidate is a bona fide pulsar detection. Clear pulsar detections
are then compared against the ATNF pulsar catalogue to check whether
they are known pulsars. If a detection is not in the catalogue, candidates listed
in other surveys are then checked using the Pulsar Survey
Scraper tool.$^{g}$ If the pulsar is in either the ATNF catalogue or
in another survey’s candidate list, a note is made against the
candidate with the pulsar’s name, visible to all other users.


$^{g}$ See https://pulsar.cgca-hub.org/

Figure 8. The theoretical array factor (proxy for sensitivity) in the vicinity
of PSR J0026-1955 for observation 1226062160, assuming a true position
(centre of image) derived from GMRT imaging (cf. Paper II for details). Red
crosses mark the position of beams in which it was detected, and the blue
dot marks the first detection. A single cross may indicate multiple
detections with slightly different periods and DMs.

Each candidate can be ranked by multiple users (but users
can only rank each candidate once). A candidate that has been
rated by at least four users becomes eligible for follow-up, and
the list of eligible candidates is ordered by the average rating.
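The eligibility rule lends itself to a short sketch. The data layout below (a dictionary of per-user ratings for each candidate) and the function name are assumptions made for illustration; only the threshold of four ratings and the ordering by average rating come from the text. Keying the ratings by user also enforces the one-rating-per-user rule.

```python
from statistics import mean

MIN_RATINGS = 4  # from the text: at least four users must have rated a candidate


def followup_queue(ratings):
    """Return (candidate id, average rating) pairs, best first.

    ratings: dict mapping candidate id -> {user name: integer rating 1-5}.
    """
    eligible = {
        cand_id: mean(by_user.values())
        for cand_id, by_user in ratings.items()
        if len(by_user) >= MIN_RATINGS
    }
    return sorted(eligible.items(), key=lambda item: item[1], reverse=True)
```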

Currently, as the number of users of the system is still rela- 
tively small, the rating of candidates is the primary bottleneck 
in the whole processing chain. This means that during first- 
pass processing, interesting candidates have been followed up 
immediately. In the future, however, as the number of users
performing the task of rating candidates grows, the pool of
eligible candidates may grow faster than the rate at which they
can be followed up. Even then, the above system of candidate
prioritisation means that the most interesting candidates are
always followed up first.


3.4 Data management and web-app 


The large number of generated candidates, the complex metadata
associated with them, and the desire to distribute the tasks
of data processing, candidate rating and candidate follow-up
motivated the implementation of a relational database to track
the progress of the SMART survey and coordinate processing
efforts. The database, implemented in PostgreSQL, comprises
a set of tables containing metadata for


1. MWA observations (e.g., primary beams, tied-array beams,
candidates);
2. software (e.g., for beamforming, searching, ML classification),
including versioning information;
3. candidate ratings;
4. pulsars;
5. users; and
6. supercomputer facilities.


The users, along with their database access privileges and
authentication, are managed by a subset of tables which interface
with a website front end implemented in Django. Both the
database and the website are hosted by Data Central.$^{h}$

Once an observation has been processed and the candi- 
dates have been subjected to the first-pass ML cull (§3.3.1), 
both the metadata of the remaining candidates as well as the 
candidates themselves (i.e., PRESTO .pfd files and the associated
diagnostic plots) are uploaded to Data Central. The
uploaded candidates are then available for users to rate via the 
web interface (§3.3.3). 

As described above, candidates can then be sorted by their 
average rating, and followed up at will by any authorised user. 
Before following up a candidate, the user may “claim” it by 
clicking a button in the candidate list. This feature is designed 
to prevent multiple users from following up the same candidate 
and unnecessarily duplicating effort. The decentralised design 
allows members of the SMART collaboration from different 
research institutions to work through the SMART data set 
without the need for someone to oversee and coordinate the 
different groups’ activities. 
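The "claim" step described above amounts to an atomic conditional update in the database. The sketch below illustrates the idea with Python's built-in `sqlite3` module as a stand-in; the survey database itself is PostgreSQL behind a Django front end, and the table name, column names and function names here are hypothetical.

```python
import sqlite3


def make_db():
    """Create a toy in-memory table of candidates, none of them claimed."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE candidate (id INTEGER PRIMARY KEY, claimed_by TEXT)")
    db.executemany("INSERT INTO candidate (id) VALUES (?)", [(1,), (2,)])
    db.commit()
    return db


def claim(db, cand_id, user):
    """Claim a candidate for follow-up; return False if someone already has it."""
    cur = db.execute(
        "UPDATE candidate SET claimed_by = ? WHERE id = ? AND claimed_by IS NULL",
        (user, cand_id),
    )
    db.commit()
    # The WHERE clause makes the check and the write a single statement,
    # so two users cannot both succeed for the same candidate.
    return cur.rowcount == 1
```

Doing the check and the write in one statement is what removes the need for a human coordinator: whoever's update lands first owns the candidate.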


$^{h}$ See https://apps.datacentral.org.au/smart


4. Confirmation and initial follow-up of candidates

Confirmation and follow-up of promising pulsar candidates 
typically relies on multiple re-observations, often requiring a 
significant amount of telescope time. Fortunately, the SMART 
survey's unique design, where VCS data are retained (unlike 
pre-processed beamformed data), offers flexible reprocessing 
options, allowing us to accelerate important confirmation and 
follow-up procedures. Furthermore, a substantial amount of 
archival VCS data (from past projects) are available for a large 
part of the MWA sky, which can also be suitably exploited for 
further detection and improved localisation. These features 
make the SMART survey distinct from other pulsar surveys. 

In the following sections we outline the main strategies that 
are adopted for confirmation and initial follow-up, including: 
reprocessing of the original observation for improved detec- 
tion; performing a dense grid for improved sky localisation; 
and polarimetry via reprocessing the survey observation for 
full Stokes information and rotation measure (RM) determina- 
tion. Further detailed follow-ups including the use of archival 
data for timing analysis and imaging for improved localisation 
are discussed in the companion paper (Paper II). 


4.1 Improved detection


For our ongoing shallow survey, processing the full 80-minute 
observation itself readily provides an avenue for confirmation. 
If the source is genuine and a steady periodic emitter, this 
should result in a three-fold improvement in S/N. The im- 
provement will be reduced if it is an intermittent source; e.g., 
a pulsar with large nulling fraction. Both these possibilities are 
exemplified in Fig. 6, which shows the original discovery plots
along with the improved detections for PSRs J0036-1033 and
J0026-1955. The full 80-minute observations (42 TB) containing
the original detection can be processed and searched
over a restricted range in P and DM using the PRESTO
prepfold routine. The observations were also processed using
the pdmp routine within the PSRCHIVE pulsar data processing
suite$^{i}$ (Hotan et al., 2004; van Straten et al., 2012), to provide
a cross-check and a more accurate DM. This is equivalent
to undertaking a longer observation for confirmation. For 
many of our candidates, this readily provides effective ways of 
confirming or rejecting a candidate, and eliminates the need 
for securing additional telescope time that most other surveys 
typically require. 
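The three-fold figure quoted above follows from the radiometer equation, under which the S/N of a steady source grows as the square root of the integration time (assuming stationary noise and no nulling):

```latex
\frac{(\mathrm{S/N})_{80\,\mathrm{min}}}{(\mathrm{S/N})_{10\,\mathrm{min}}}
  = \sqrt{\frac{t_{80}}{t_{10}}}
  = \sqrt{\frac{4800\ \mathrm{s}}{600\ \mathrm{s}}}
  = \sqrt{8} \approx 2.8
```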


While the long dwell time of 4800s should in principle 
result in an increased sensitivity to sporadic or intermittent 
pulsars, our current first-pass processing does not necessarily 
benefit from this. Given this, the discovery of PSR J0026-1955 
in the first 10 minutes of observations, a pulsar with long- 
duration nulls and a nulling fraction of ~77%, was remarkably 
fortuitous (see Fig. 6). Details of the discovery, including an 
analysis of sub-pulse drifting, are reported in McSweeney et al. 
(2022). As mentioned therein, this pulsar turned out to have
already been reported as a candidate in the GBNCC survey
but was blindly (and independently) discovered in the SMART
survey data.


$^{i}$ See https://sourceforge.net/projects/psrchive/


4.2 Improved positional determination 


As outlined in §2.3, the tied-array beam size for SMART is 
~ 23'. Therefore a more accurate position is essential both for 
improved detection (i.e., re-beamforming on a more exact sky 
position) and to facilitate effective follow-ups with other (and 
more sensitive) telescopes, particularly at higher frequencies 
where the beams are narrower, even with single-dish tele- 
scopes such as Parkes. This would typically involve making 
multiple re-observations to form a grid around the nominal 
candidate position. The SMART survey design, in which the sky
is densely sampled (at a rate comparable to, or slightly better
than, the Nyquist rate; Fig. 1), allows this to be achieved via
reprocessing of the original survey observation, where a dense
grid of pointings encompassing the initial position is used for
improved positional determination. An example is shown in
Fig. 9 for the case of PSR J0026-1955. In general, for an initial 
detection with a modest significance of S/N ~ 10, we may 
expect a positional accuracy of ~1-2' through this exercise. In
practice, archival VCS data, if available, can also be
exploited to progressively improve the position further. In an
ideal scenario, where data recorded from all three different
configurations are available, an improvement of
nearly two orders of magnitude can be achieved through this
procedure, as demonstrated in Swainston et al. (2021).


4.3 Polarimetry 


The VCS recording allows the reprocessing of discovery obser- 
vations to generate full polarimetric beamformed time series, 
which can be analysed using standard pulsar packages such as
DSPSR$^{j}$ (van Straten & Bailes, 2011) and PSRCHIVE, for full
Stokes profiles. These beamformed MWA data were obtained 
using the procedures described in Ord et al. (2019) and Xue 
et al. (2019). The Faraday rotation measure synthesis tech- 
nique (Brentjens & de Bruyn, 2005) can then be applied to 
estimate the rotation measure (RM). 
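The RM synthesis step can be illustrated with a short NumPy sketch. This is a bare-bones version of the technique (uniform channel weights, no deconvolution of the RM spread function) and is not the pipeline used for the survey; the function name, the synthetic channelisation and the intrinsic position angle are assumptions chosen for the demonstration. The Faraday dispersion function is evaluated as F(phi) = (1/N) sum_k P_k exp[-2i phi (lambda_k^2 - lambda_0^2)], and the RM is read off from the peak of |F|.

```python
import numpy as np

C_M_PER_S = 299_792_458.0  # speed of light


def rm_synthesis(freqs_hz, stokes_q, stokes_u, phi):
    """Faraday dispersion function F(phi) with uniform channel weights."""
    lam2 = (C_M_PER_S / np.asarray(freqs_hz, dtype=float)) ** 2
    lam2_0 = lam2.mean()                         # reference wavelength squared
    pol = np.asarray(stokes_q) + 1j * np.asarray(stokes_u)
    kernel = np.exp(-2j * np.outer(phi, lam2 - lam2_0))
    return kernel @ pol / pol.size


# Synthetic, noise-free test signal across the 140-170 MHz band
# (768 channels is an arbitrary choice for this demo).
freqs = np.linspace(140e6, 170e6, 768)
lam2 = (C_M_PER_S / freqs) ** 2
chi = 0.3 + 3.65 * lam2                          # PA = chi_0 + RM * lambda^2
q, u = np.cos(2 * chi), np.sin(2 * chi)

phi = np.arange(-20.0, 20.0 + 1e-9, 0.05)        # trial Faraday depths, rad m^-2
fdf = rm_synthesis(freqs, q, u, phi)
rm_peak = phi[np.argmax(np.abs(fdf))]            # recovers the injected RM to within the grid step
```

For real data the peak would normally be refined by fitting around the maximum, and the uncertainty scaled by the width of the RM spread function and the polarised S/N.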

As an example, Fig. 10 shows polarisation data for pulsar
J0026-1955, obtained by reprocessing the original discovery
observation. This yielded an RM estimate of 3.65 ±
0.09 rad m$^{-2}$. After correcting for Faraday rotation, linear and
circular polarisation were detected. The pulsar exhibits a significant
amount of linear polarisation but only a small amount of
circular polarisation. We attempted to fit the rotating vector
model (Radhakrishnan & Cooke, 1969) to the position angle
(PA) of the linear polarisation across the on-pulse window,
in order to constrain the viewing geometry, ($\alpha$, $\beta$), where $\alpha$
is the angle between the magnetic and rotation axes, and $\beta$
is the impact angle of the magnetic axis on the line of sight.
In the absence of relativistic effects, the PA curve is expected
to be steepest in the centre of the pulse profile, with slope
$d\psi/d\phi = \sin\alpha / \sin\beta \sim 2.4$, where $\psi$ is the PA at phase $\phi$.
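For reference, the rotating vector model and its maximum slope can be written as follows, using one common sign convention and writing $\zeta = \alpha + \beta$ for the angle between the rotation axis and the line of sight; $\psi_0$ and $\phi_0$ (the PA and phase at the steepest point) are symbols introduced here for illustration:

```latex
\tan(\psi - \psi_0) =
  \frac{\sin\alpha \, \sin(\phi - \phi_0)}
       {\sin\zeta \cos\alpha - \cos\zeta \sin\alpha \cos(\phi - \phi_0)},
\qquad
\left. \frac{d\psi}{d\phi} \right|_{\phi = \phi_0}
  = \frac{\sin\alpha}{\sin(\zeta - \alpha)}
  = \frac{\sin\alpha}{\sin\beta}
```

Because only the ratio $\sin\alpha/\sin\beta$ is fixed by the slope, a measured value of about 2.4 constrains a family of ($\alpha$, $\beta$) pairs rather than a unique geometry.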


$^{j}$ See https://sourceforge.net/projects/dspsr/




5. Survey simulations and forecast 

The ongoing first-pass processing (i.e., essentially a shallow
survey for long-period pulsars) is limited to processing only a
fraction (1/8th) of our observation time over coarser (sub-optimal)
trial DM values, out to a maximum DM of 250 pc cm$^{-3}$, and
to a basic periodicity search. In the second pass we will extend
this to full 80-min observations and employ more optimal DM
steps. Besides a three-fold increase in sensitivity expected for
long-period pulsars (by virtue of longer integration times),
substantial improvements in sensitivity to millisecond pulsars
are also expected via finer DM steps and optimal dedispersion
plans to match our 100-µs/10-kHz resolutions. These
considerations motivated our simulation analysis to make
meaningful forecasts of the expected survey yield, both for
long-period pulsars and MSPs, as summarised below. They
provide further justification for undertaking full-scale search
processing, planned as part of second-pass processing.


5.1 Long-period pulsars 


The discovery of two new pulsars from the processing of a small
fraction of survey data hints at the potential for many new pulsar
discoveries from a deeper survey that will take advantage of
the full 80-min observation. To estimate the survey yield, we
have performed survey simulations, using the formalism
outlined in Xue et al. (2017). The analysis made use of the popular
simulation package PsrPopPy (Bates et al., 2014) that was
developed from the original pulsar simulation software PSRPOP
by Lorimer et al. (2006). The simulations take into account the
sky dependence of the system temperatures at low frequencies
($T_{\rm sky} \propto \nu^{-2.55}$), as well as the loss in the array gain ($G$) expected
at large zenith angles, modelled as $G(\theta_z) = G_{\max} \cos(\theta_z)$,
where $\theta_z$ is the zenith angle and $G_{\max}$ is the gain at $\theta_z = 0$. We
simulated a population of 1.6 × 10$^{5}$ Galactic canonical pulsars,
extrapolated from Parkes Multi-beam Pulsar Survey (Manchester
et al., 2001) detections. The luminosity distribution of
the canonical pulsar population follows a log-normal distribution
($\langle \log_{10} L \rangle = -1.1$, $\sigma[\log_{10} L] = 0.9$), where $L$ is the radio
pseudo-luminosity in units of mJy kpc$^{2}$ (Faucher-Giguère &
Kaspi, 2006). The Galactic radial density distribution follows
the Yusifov & Küçük (2004) model. With the caveat that our
understanding of the pulsar luminosity function and beaming
fraction is limited, we project the deep survey to reach a
limiting sensitivity of ~2-3 mJy, with a potential net yield of
310 ± 100 new pulsar discoveries (see Fig. 11). This projection
mainly applies to the population of long-period pulsars
and does not account for other classes of pulsars such as sporadic
emitters (e.g., RRATs), or millisecond and binary pulsars,
whose populations are hard to model or simulate.
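The individual ingredients named above are simple to express in code. The NumPy sketch below is not PsrPopPy and does not reproduce the simulation; it only illustrates the log-normal luminosity draw, the cosine gain model and the sky-temperature scaling, with function names and reference values chosen here for illustration.

```python
import numpy as np


def draw_luminosities(n, rng, mean_log10=-1.1, sigma_log10=0.9):
    """Draw pseudo-luminosities (mJy kpc^2) from a log-normal distribution."""
    return 10.0 ** rng.normal(mean_log10, sigma_log10, size=n)


def array_gain(zenith_angle_deg, g_max=1.0):
    """Gain model G(theta_z) = G_max * cos(theta_z)."""
    return g_max * np.cos(np.radians(zenith_angle_deg))


def sky_temperature(freq_mhz, t_ref_k, freq_ref_mhz):
    """Scale a reference sky temperature as T_sky ~ nu^-2.55."""
    return t_ref_k * (freq_mhz / freq_ref_mhz) ** -2.55


def relative_limiting_flux(zenith_angle_deg, t_sys_k, t_sys_zenith_k):
    """Limiting flux relative to zenith, taking S_min proportional to T_sys / G."""
    return (t_sys_k / t_sys_zenith_k) / array_gain(zenith_angle_deg)
```

For example, at a zenith angle of 60 degrees the gain halves, so the limiting flux density doubles even before any change in sky temperature is taken into account.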

Figure 9. MWA localisation of PSR J0026-1955 by performing a dense grid around the initial pulsar position from the discovery observation. The source
position (RA, Dec) = (00h 26m 37.5s, -19° 56' 24.9") is ~32" offset from the uGMRT-determined position (cf. Paper II for further details). Observations were
made using the extended MWA array (Phase II, with ~6 km maximum baseline). The uncertainty in the MWA position is ~12" (i.e., about one-tenth of the
tied-array beam size, shown as dashed circles on the left panel).

[Figure 10 plot: panel label "J0026-1956 154.24 MHz"; horizontal axis: Pulse Phase]

Figure 10. Polarimetric profiles of PSR J0026-1955 obtained by reprocessing
the discovery observation at 155 MHz. The black, red, and blue curves
in the lower panels show the total intensity, linear, and circular polarisation,
respectively. An RM estimate of 3.65 ± 0.09 rad m$^{-2}$ was obtained, and the
data were corrected for Faraday rotation.

Assuming an isotropic distribution of our simulated local
pulsar population (DM < 250 pc cm$^{-3}$), and scaling for
the current (first-pass) search sensitivity (i.e., one-third of the
deep-pass sensitivity), and the fraction of data for which the
candidate scrutiny has been completed (~5%), we may expect
~3-5 pulsars. The detection rate at this early stage of
SMART thus appears to be in line with this general expectation.
While this may seem fortuitous, the unique advantages
of the SMART pulsar survey, especially the accessibility to the
southern hemisphere, the radio-quiet environment, and the
survey parameters (e.g., long dwell times and high time/frequency
resolutions), offer excellent prospects for new pulsar
discoveries, provided the substantial processing challenges can
be addressed.
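One rough way to arrive at a number of this order is sketched below. It assumes a uniform (Euclidean) local population, for which the number of sources above a flux-density limit $S_{\min}$ scales as $S_{\min}^{-3/2}$, and takes the projected deep-survey yield of ~310 at face value. Both are simplifications made here for illustration and are not a statement of how the simulation itself was scaled.

```latex
N_{\rm first\,pass} \approx N_{\rm deep}
  \left( \frac{S_{\rm min,\,shallow}}{S_{\rm min,\,deep}} \right)^{-3/2}
  f_{\rm scrutinised}
  \approx 310 \times 3^{-3/2} \times 0.05 \approx 3
```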


5.2 Millisecond pulsars 


Even though the detection sensitivity to MSPs is significantly
reduced in our current shallow pass of the survey (owing to
the use of coarse or sub-optimal DM step sizes; see Fig. 2),
the second-pass processing, where we plan to employ more
optimal DM searches with a finer step size in DM, is
expected to yield a substantial improvement in sensitivity,
particularly at low to moderate DMs, out to ≲50 pc cm$^{-3}$. At
DMs ≳70 pc cm$^{-3}$, and especially in regions near the Galactic
plane and toward the centre, scatter broadening is expected to
result in sensitivity degradation, given its strong frequency
dependence (pulse broadening time, $\tau_d \propto \nu^{-4}$; cf. Bhat et al.,
2004), due to which $\tau_d \gtrsim 10$ ms, which, for millisecond pulsars,
can be a substantial fraction of the rotation period. Using
PsrPopPy, we simulated a population of 3 × 10$^{4}$ MSPs with P
and DM distributions essentially derived from the HTRU
intermediate latitude pulsar survey (Levin et al., 2013), and with
a luminosity limit of $L_{1400} \sim 0.2$ mJy kpc$^{2}$. This corresponds
to a limiting flux density ~10 mJy at 150 MHz, assuming a
spectral index of $\alpha = -1.8$ (and a distance of ~1 kpc), and thus
in principle detectable provided there is no significant degradation
from dispersive smearing or temporal broadening from
scattering.
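The conversion from the 1400-MHz luminosity limit to a 150-MHz flux density is a one-line power-law scaling, shown here as a small helper (the function name and argument names are illustrative):

```python
def flux_density_mjy(lum_mjy_kpc2, dist_kpc, freq_mhz,
                     ref_freq_mhz=1400.0, spectral_index=-1.8):
    """Scale a pseudo-luminosity L = S * d^2 at ref_freq to a flux density at freq."""
    s_ref = lum_mjy_kpc2 / dist_kpc ** 2          # flux density at the reference frequency
    return s_ref * (freq_mhz / ref_freq_mhz) ** spectral_index


s_150 = flux_density_mjy(0.2, 1.0, 150.0)         # about 11 mJy, i.e. the ~10 mJy quoted above
```

The result is sensitive to the assumed spectral index: a steeper or shallower spectrum changes the 150-MHz flux density substantially, because the two frequencies differ by almost a factor of ten.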


Figure 11. Simulated pulsars detectable (colour filled circles) in an all-sky high-time-resolution pulsar search with the MWA in the 140-170 MHz band. The
shaded region represents the MWA's visible sky, i.e., the sky south of +30° in declination. The black filled circles represent known pulsars in the ATNF pulsar
catalogue (version 1.67). The colour scale indicates the DM in units of pc cm$^{-3}$.

As with the population of long-period pulsars, this analysis
accounted for the sky dependence of $T_{\rm sys}$, non-uniformity
in the array gain, and strong frequency scaling of scattering,
which is especially important for MSPs. For example,
using some preliminary dedispersion plan estimates for the
second round of processing (i.e., the deep search), where we
assume a typical plan would involve DM steps of 0.01 pc cm$^{-3}$
up to 54 pc cm$^{-3}$ and 0.02 pc cm$^{-3}$ out to 107 pc cm$^{-3}$, our
simulations predict 55 detectable MSPs above our detection
threshold, and hence ~15 new MSP discoveries. However,
a substantial increase is forecast in simulations that closely
emulate the higher sensitivity attainable through more optimal
searches that make use of coherent dispersion measure trials
(CDMT), which is equivalent to the use of finer DM steps
of 0.002 pc cm$^{-3}$, and will limit residual DM smearing to
~150 µs (comparable to the ~100 µs native resolution of the
VCS). In essence, this means that full-scale, high-sensitivity
searches employing the implementation of CDMT, if feasible
for SMART, can potentially lead to the discovery of as many
as ~30 MSPs.
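The link between DM step size and residual smearing can be checked with the standard cold-plasma dispersion relation. The sketch below computes the differential delay across the band for a given DM offset; the band edges of 140 and 170 MHz are taken from the survey description, while treating the offset as one full DM step is an assumption made here to give an upper-end estimate.

```python
K_DM_MS = 4.148808  # dispersion constant in ms GHz^2 cm^3 pc^-1


def dm_offset_smearing_us(dm_offset, f_lo_mhz=140.0, f_hi_mhz=170.0):
    """Delay across the band (microseconds) caused by a DM error of dm_offset."""
    f_lo_ghz, f_hi_ghz = f_lo_mhz / 1e3, f_hi_mhz / 1e3
    delay_ms = K_DM_MS * abs(dm_offset) * (f_lo_ghz ** -2 - f_hi_ghz ** -2)
    return 1e3 * delay_ms


fine = dm_offset_smearing_us(0.002)    # roughly 136 microseconds
coarse = dm_offset_smearing_us(0.01)   # five times larger
```

With steps of 0.002 pc cm$^{-3}$ the band-wide smearing is of the same order as the ~100 µs native resolution, whereas steps of 0.01 pc cm$^{-3}$ smear a pulse over a large fraction of a millisecond, which is why the coarser plan penalises the fastest MSPs.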


The simulated population of ~70 MSPs, along with the
simulated population of long-period pulsars (see §5.1), is shown
in Fig. 12. Our simulation analysis did not consider a large
population of MSPs discovered in recent (and highly successful)
Fermi-directed targeted searches (Deneva et al., 2021,
and references therein). Even so, the detectable population of
MSPs is almost twice the currently known population within
DM < 100 pc cm$^{-3}$, which means a net MSP yield that is
competitive with that from the highly successful Parkes HTRU
survey. Indeed, as evident from Fig. 12, the detectable population
of MSPs is limited to DM ≲ 70 pc cm$^{-3}$, which is
understandable given the expected pulse broadening times of
$\tau_d \gtrsim 10$ ms toward such moderate-DM pulsars at the low
frequencies of the MWA (e.g., Kirsten et al., 2019). Consequently,
the vast majority of MSPs discovered will likely be suitable
for high-precision timing applications such as pulsar timing
arrays.


6. Future processing plans 

The planned second-pass survey will extend the processing
to the full 80-min observations and carry out more optimal
searches in the DM parameter space, while incorporating
searches for both long-period pulsars and millisecond pulsars.
As such, the long dwell times of SMART (4800 s) can be
exploited to search for pulsars with very long periods, like those
discovered by LOFAR and MeerKAT (Tan et al., 2018b; Caleb
et al., 2022), and provide increased sensitivity to objects that
emit intermittently, e.g., pulsars with long null durations such
as PSR J0026-1955 (McSweeney et al., 2022). In addition, the
adopted strategy to archive recorded voltages offers additional
avenues for future processing; e.g., searches for millisecond
pulsars through the application of novel hybrid dedispersion
approaches that involve the use of coherent dispersion measure
trials (CDMT), which was demonstrated with LOFAR
through the discovery of PSR J0952-0607 (Bassa et al., 2017).
Below we outline our processing plans and strategies in the
near term and highlight some of the computational challenges
and other considerations in planning this second-pass processing.


6.1 Beamforming and sensitivity optimisation 

As discussed earlier in §2.5, the tied-array beamforming strategy
warrants some more careful thought in order to maximise
sensitivity while also reducing needless processing. Inevitably,
this produces an uneven sensitivity threshold across the sky
due to both primary beam pointing effects and effective dwell
time. These considerations are also important when estimating
survey-wide statistics. We are formulating a more efficient
beamforming scheme that takes into account these technical
details, which will be presented in a subsequent paper detailing
the second-pass survey processing.

Figure 12. Simulated pulsars detectable in an all-sky pulsar search with the MWA's 140-170 MHz band with a dwell time of 4800 seconds. The shaded region
represents the MWA's visible sky, i.e. the sky south of +30° in declination. The black filled circles denote the long-period pulsars, whereas millisecond pulsars
detectable in high-sensitivity searches (e.g., using the CDMT) are shown as colour filled circles. The colour scale indicates DM in units of pc cm$^{-3}$.


6.2 Dedispersion planning and RFI mitigation strategies

For the first-pass survey processing described in this paper,
the dedispersion plan outlined in Table 2 is adequate for all 
observations. In contrast, a slightly more sophisticated plan 
may be required for the second-pass processing to accom- 
modate the eight-fold increase in observation length and to 
provide increased sensitivity to shorter-period pulsars. We 
are actively developing a sensible strategy that balances our 
sensitivity goals and the relatively large computational costs 
associated with dedispersing MWA VCS data, especially since 
we would essentially be producing ~10x as many DM trials. 
In addition to revisiting the dedispersion plan, we will also 
incorporate a more careful approach to excising or mitigating 
RFI (both periodic and impulsive). The observatory site is 
exceptionally RFI-quiet (owing to the geographical location 
and radio-quiet zone status), hence the first-pass processing 
did not include any active RFI mitigation other than what 
is naturally gained by forming TABs (where off-axis RFI is 
“phased out”). We are currently examining the periodic RFI
environment by processing observations taken throughout the
SMART observing semesters and using a standard PRESTO-based
approach to find bright, common terrestrial signals by
searching for periodic “candidates” in the zero-DM topocentric
time series data. Once we collect this information, we will
apply the masks (after appropriate barycentric corrections are
made) during the periodicity search pipeline. Additionally,
there can occasionally be bursts of narrow-band interference
(e.g., aircraft and satellites in TAB grating lobes) that could
severely affect our data quality for short periods of time. There
are several software pre-processing solutions to this kind of
RFI (e.g., Eatough et al., 2009; Men et al., 2019; Morello et al.,
2022), which we will explore in parallel to the periodic RFI
mitigation strategies. Empirically, VCS data are remarkably
clear of impulsive/narrowband RFI in the SMART observing
band, and data excision is ≪10% for a typical observation.


6.3 Searches for long-period pulsars and sporadic emitters 


The long dwell times of SMART make it particularly amenable
to the application of fast-folding algorithms that offer significantly
higher sensitivity to pulsars with rotation periods ≳10 s
(e.g., Morello et al., 2020). Such slow-spinning pulsars are
likely to be near the radio emission ‘death lines’ and so can be
invaluable in gaining useful insights into the intricacies of the
pulsar radio emission process. Recent applications of this
algorithm in Parkes and Arecibo searches have led to the discoveries
of pulsars with P > 10 s (Morello et al., 2020), or very weak
pulsars ($S_{1400} \sim 10$ µJy) with a ~2% duty cycle (Parent et al.,
2018). These, and other recent discoveries such as a 76-s pulsar
with MeerKAT (Caleb et al., 2022), provide a strong motivation
for undertaking fast-folding searches. The low levels of
RFI at the observatory site are particularly advantageous for
this.

The SMART survey dwell time is substantially longer than
those of previous-generation southern-sky surveys, particularly
at high-|b| parts of the sky, where it is 20 times longer
than that of the HTRU survey (Keith et al., 2010) and 40 times longer
than that of the southern pulsar survey (Manchester et al., 1996). It is
also 40 times longer than that of the ongoing GBNCC survey
(Stovall et al., 2014) that covers the sky north of -55° in
declination (Table 1). Considering this, detection prospects are
promising, especially given the ~2-3 mJy limiting sensitivity
that SMART can attain for long-period pulsars (§2.4) and
negligible degradation in signal strength due to dispersion and
pulse broadening effects.

As described earlier, the long dwell times also increase 
the search sensitivity to objects that emit sporadically, such as 
RRATs and giant-pulse emitters (e.g., the Crab pulsar), which 
can be more effectively detected by searching for individual 
dispersed pulses, and will be part of the second-pass processing. 


6.4 Searches for binary and millisecond pulsars 


The long dwell times, and high time and frequency resolutions,
of SMART can also be exploited, in principle, to
search for binary and millisecond pulsars. However, a full-scale
acceleration search can be prohibitively expensive at
low frequencies, given the very large number of DM and
acceleration trials that are required (e.g., typically ~10$^{4}$ up to
250 pc cm$^{-3}$, and ~2400 across ±100 m s$^{-2}$). Compared to
the HTRU-south low-latitude survey, which has been successful
in finding such systems (e.g., Cameron et al., 2020), the
cost of searching SMART data can be more than an order of
magnitude greater. The successful detection of several MSPs
and the double pulsar in our initial census (cf. Paper II) makes
such searches worthwhile.

An inherent limitation in the searches for such short-period 
pulsars is the significant degradation in sensitivity due to sub- 
stantial dispersion smearing (relative to rotation periods) de- 
spite our 10-kHz channels. Fortunately, this can be alleviated 
by using CDMT-based searches (Bassa et al., 2017). Record- 
ing in 24 1.28-MHz channels makes the SMART data highly 
amenable to the application of CDMT searches, and can re- 
sult in a substantial increase in detection sensitivity to short- 
period millisecond pulsars. Integration of this novel method, 
and benchmarking on prospective HPC clusters with signifi- 
cant computational resources (e.g., Pawsey’s emerging Setonix 
cluster) is also part of our future processing plans, although a 
full-scale processing may have to await access to sufficient com- 
putational resources. We are also exploring publicly available, 
GPU-enabled Fourier domain acceleration search software 
(e.g., AstroAccelerate; Armour et al., 2020) as a drop-in re- 
placement for PRESTO’s CPU-based accelsearch. 

Regardless, the high cost of such computationally-intensive 
searches will likely necessitate a multi-pass processing strategy; 
for instance, an initial pass involving acceleration searches, 
but limited to a modest number of acceleration trials (e.g.. 
7-150 to cover +6 m s^), thereby retaining sensitivity to short- 
period objects (P < 10 ms) but with the binary orbital period, 
P, = 5 days (i.e., with low-mass white-dwarf type compan- 
ions). Full-scale acceleration searches that target binary systems 
such as PSR J0737-3039 or PSR J1757-1854 with Pj < 5hr 
(i.e., requiring ~2400 trials spanning across +100 m $2) are 
hence deferred to the longer-term future. Such searches will 
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be primarily limited to the regions around the Galactic plane, 
at least initially, thus processing only a fraction of the SMART 
data (e.g., sky within |b| < 5°). Such a multi-pass strat- 
egy is also motivated by the demonstrated success of HTRU- 
south, which has led to the discovery of exotic systems such as 
PSR J1757-1854 (Cameron et al., 2018) and a wide-orbit double 
neutron-star system (Sengar et al., 2022). In any case, notwith- 
standing the high computational cost, the high-profile scien- 
tific applications of such rare systems make similar full-scale 
acceleration searches scientifically compelling for the SMART 
data. The long-term scientific dividends of such systems are 
vividly demonstrated by Kramer et al. (2021) through the 16- 
year timing analysis of the double pulsar, enabling the most 
stringent tests of general relativity and alternative theories of 
gravity. 
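
The relative cost of the two acceleration-search passes described above can be gauged with simple arithmetic. The sketch below assumes uniformly spaced trials (an assumption; the actual trial spacing is not specified here) and takes the quoted trial counts as approximately 150 and 2400. The function name is hypothetical.

```python
def accel_step(max_accel, n_trials):
    """Step between acceleration trials (m s^-2), assuming uniform
    spacing across the symmetric range [-max_accel, +max_accel]."""
    return 2.0 * max_accel / n_trials


# Initial pass: about 150 trials covering +/-6 m s^-2
initial_step = accel_step(6.0, 150)

# Full-scale pass: about 2400 trials covering +/-100 m s^-2
full_step = accel_step(100.0, 2400)

# Ratio of trial counts, a proxy for relative cost per beam
# (assumes cost scales linearly with the number of trials)
cost_ratio = 2400 / 150

print(f"initial step    ~ {initial_step:.3f} m s^-2")
print(f"full-scale step ~ {full_step:.3f} m s^-2")
print(f"trial-count ratio = {cost_ratio:.0f}")
```

Both passes imply a similar step of about 0.08 m s^-2, so, if the cost scales with the number of trials, the full-scale search is roughly 16 times more expensive per beam simply because it spans a wider acceleration range.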


7. Summary and conclusions 


With its novel features such as voltage recording and long dwell 
times, and access to the pristine radio-quiet environment in the 
southern hemisphere, the SMART survey is well positioned 
to play an impactful role in the exploration of the southern, 
low-frequency sky for pulsar surveys and science. Since the 
MWA is a precursor for SKA-Low, the SMART survey will 
also serve as an important preparatory step for pulsar surveys 
planned with SKA-Low. Additionally, it will map out the 
southern sky for low-frequency detections of many pulsars 
that were originally discovered at frequencies ≳400 MHz. 

The survey is enabled by the advent of the Phase II up- 
grade of the MWA, the compact configuration of which offers 
an enormous gain in the beamforming and processing cost, 
thereby making large all-sky pulsar surveys tractable with 
large-FoV interferometric arrays such as the MWA. The combination 
of voltage recording and the large FoV brings a survey 
efficiency of ~450 deg^2 hr^-1, but at the expense of large data 
rates of 28 TB hr^-1. Consequently, the full survey amounts to ~3 PB 
of (VCS mode) data and incurs significant processing costs. 
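
The survey efficiency and data volume quoted above follow from figures given earlier in the paper: a ~610 deg^2 field of view, a 4800 s dwell time, and under 100 hours of telescope time. The sketch below reproduces the arithmetic. The variable names are illustrative, 100 hours is used as an upper bound, and 1 PB is assumed to be 1000 TB.

```python
FOV_DEG2 = 610.0          # field of view at 155 MHz (deg^2)
DWELL_S = 4800.0          # dwell time per pointing (s)
DATA_RATE_TB_HR = 28.0    # VCS data rate (TB per hour)
TOTAL_HOURS = 100.0       # upper bound on telescope time (assumed for this sketch)

# Sky area covered per hour of observing
efficiency_deg2_hr = FOV_DEG2 / (DWELL_S / 3600.0)

# Total raw data volume, assuming 1 PB = 1000 TB
volume_pb = DATA_RATE_TB_HR * TOTAL_HOURS / 1000.0

print(f"survey efficiency ~ {efficiency_deg2_hr:.0f} deg^2 per hour")
print(f"data volume       ~ {volume_pb:.1f} PB")
```

The results, roughly 460 deg^2 per hour and 2.8 PB, are consistent with the rounded values of ~450 deg^2 hr^-1 and ~3 PB.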

Due to the substantial computational cost involved in search- 
ing at low frequencies, the processing is undertaken in multiple 
passes. In the ongoing first-pass processing, 10 min of data 
from each observation are processed in 2358 trial DMs, out 
to a maximum DM of 250 pc cm^-3, thereby reaching about 
one-third of the sensitivity that will eventually be attainable 
in full observation processing. 
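
The one-third figure can be understood from the radiometer equation, in which the minimum detectable flux density scales as the inverse square root of the integration time. The sketch below applies that scaling to 10 min out of the 4800 s dwell time; it ignores all other factors (e.g., scintillation or red noise), and the function name is hypothetical.

```python
import math


def sensitivity_fraction(t_processed_s, t_full_s):
    """Fraction of full search sensitivity reached when only part of an
    observation is processed, assuming S_min scales as 1/sqrt(t)."""
    return math.sqrt(t_processed_s / t_full_s)


T_FIRST_PASS_S = 600.0   # 10 min processed in the first pass
T_FULL_S = 4800.0        # full dwell time per observation

frac = sensitivity_fraction(T_FIRST_PASS_S, T_FULL_S)
print(f"first-pass sensitivity ~ {frac:.2f} of full")
print(f"limiting flux density is ~ {1.0 / frac:.1f}x higher")
```

This gives a fraction of about 0.35, i.e., a limiting flux density nearly three times higher than that of full-observation processing.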

The voltage recording strategy adopted for the SMART 
survey enables a multitude of avenues for follow-ups and confirmations, 
including improved detection, initial polarimetry and 
arcminute-level positional determination — all by reprocessing 
the original observation and, where possible, also archival VCS 
data. This also facilitates timely follow-up studies using more 
sensitive telescopes such as Parkes and uGMRT that operate at 
frequencies ≳300 MHz. 

With the recent development of a web app that facilitates 
efficient scrutiny of candidates, including their classification 
and ranking to identify promising ones for follow-up, 
we anticipate the discovery rate to increase in the coming 
years. As software tools mature and the search pipelines are 
expanded to include acceleration trials and fast-folding based 
algorithms, and additional computational resources become 
available, it will become possible to extend the processing to 
include searches for binary and millisecond pulsars, and those 
with very long periods or even sporadic emitters. Our sim- 
ulation analysis forecasts a survey yield of ~300 long-period 
pulsars and ~30 millisecond pulsars by the completion of full 
processing. The SMART survey data will serve as a complete 
digital record of the low-frequency southern sky, and an im- 
portant reference for even more ambitious surveys planned 
with the SKA-Low. 